Phylogenetic Reconstruction and Divergence Time Estimation of Blumea DC. (Asteraceae: Inuleae) in China Based on nrDNA ITS and cpDNA trnL-F Sequences

The genus Blumea is one of the most economically important genera of Inuleae (Asteraceae) in China. It is particularly diverse in South China, where 30 species are found, more than half of which are used as herbal medicines or in the chemical industry. However, little is known regarding the phylogenetic relationships and molecular evolution of this genus in China. We used nuclear ribosomal DNA (nrDNA) internal transcribed spacer (ITS) and chloroplast DNA (cpDNA) trnL-F sequences to reconstruct the phylogenetic relationship and estimate the divergence time of Blumea in China. The results indicated that the genus Blumea is monophyletic and it could be divided into two clades that differ with respect to the habitat, morphology, chromosome type, and chemical composition of their members. The divergence time of Blumea was estimated based on the two root times of Asteraceae. The results indicated that the root age of Asteraceae of 76–66 Ma may maintain relatively accurate divergence time estimation for Blumea, and Blumea might had diverged around 49.00–18.43 Ma. This common ancestor had an explosive expansion during the Oligocene and Miocene and two major clades were differentiated during these epochs 29.60 Ma (17.76–45.23 Ma 95% HPD (Highest Posterior Density). Evidence from paleogeography and paleoclimate studies has confirmed that Blumea experienced differentiation and an explosive expansion during the Oligocene and Miocene.


Introduction
The genus Blumea is one of the largest genera of Inuleae (Asteraceae), containing approximately 100 species [1][2][3]. It is most diverse in Asia, Africa, and Australia, and it has more than one main center of diversity in Africa and South Asia [3]. China is one center of diversity of this genus, with 30 species being distributed throughout South China, of which five are endemic [4,5]. Blumea DC. has economic and ecological value in China. More than half of the species that belong to this genus have medical or ethnobotanical value [6]; for example, Blumea balsamifera (L.) DC. is an economic source of L-borneol or camphor extraction, and it is widely cultivated in the Philippines and China [7][8][9]. Blumea megacephala (Randeria) Chang et Tseng and Blumea riparia (Bl.) DC. are important sources of medical materials for "fuxuekang" [8]. Other species, such as Blumea aromatica DC., Blumea formosana Kitam, and Blumea densiflora DC., are used to treat rheumatism, esophagitis, headache, and hypertension [6]. Zhang and Cheng first described the taxonomy, plant morphology, and anatomy of the genus Blumea in China [4]; the authors treated six subsections and 30 species. Since the first description of this genus, its phylogenetic analysis and molecular evolution have been vigorously disputed. Phylogenetic analysis using molecular markers has been widely used to attempt to resolve this dispute; Pornpongrungrueng conducted important research [10,11], whose results showed that Blumea is monophyletic if the genera Blumeopsis Gagnep. and Merrittia Merr. are included, and suggested that this genus could be divided into two main clades that differ with respect to habitat, ecology, and distribution; however, few samples of the Blumea species from China were included [10,11]. In this study, we extended the molecular phylogenetic analysis of Blumea by (1) adding several samples from China to provide more thorough coverage of the overall distribution and diversity of Blumea, and (2) performing divergence time estimation. We aimed to: (1) estimate the molecular evolution and phylogenetic relationships of Blumea with a focus on the genus in China and (2) offer the best hypothesis for the divergence time and evolutionary events of this genus in China.
The results of the best substitution models from jmodeltest 2.1.7 with Akaike information criterion (AIC) [16] and Bayesian information criterion (BIC) [17] indicated that the best substitution models for ITS, trnL-F, or combined sequnces were, respectively, as SYM (symmetrical model) + I (proportion of invariable sites) + G (gamma distribution), TVM (transversion model) + G, and GTR (general time reversible) + I + G (Table 1); Table 1 also lists other parameters. Based on the sequence of nrDNA ITS, with four different phylogenetic analysis methods-Neighbor-joining (NJ), Maximum Parsimony (MP), Maximum Likelihood (ML), and Bayesian inference (BI)-the results of all indicated that the Blumea genus was monophyletic, but the topology of the four methods was different. For example, the topology of the NJ method ( Figure 1A) indicated that Blumea could be divided into two clades. The phylogenetic relationship of the B. aromatica clade may be clear, but the phylogenetic relationships of the others were disordered, also with very low posterior probabilities. The results from the MP method ( Figure 1B) seem to mostly conform to the traditional classification relationship of Blumea; the genus could be divided into three clades: clade B. balsamifera was the basal clade, the B. aromatica clade may be the transitional clade, and others may be the higher group with more extensive environmental adaptability, such as with one-year lifetime, diverse habitats, and morphological characteristics. The results of the BI method ( Figure 1C) seem to be the same as those of the NJ method; also, with high posterior probabilities, but Caesulia axillaris Roxb is placed in the B. aromatica clade. The topology with different methods was compared with TreePuzzle 5.3.rc16 [18] and CONSEL [19], and the results indicated that the ML method was the most suitable tree-reconstruction method for ITS sequences and with first rank ( Table 2). The genus of Blumea could be divided into two major clades based on the phylogenetic tree with ITS sequence and the MP method ( Figure 1D    With the cpDNA trnL-F sequence, the results of the four different methods also indicated that the genus Blumea was divided into two clades, but the relationships that were revealed by the four methods were different, especially for the species in clade II ( Figure 2). Combining the topology results indicated that the Bayesian inference (BI) method was possibly the most suitable tree reconstruction method (Table 2). With the BI method, Blumea could also been divided into two clades: clade I, with the same members of the nrDNA ITS dataset, consisting of three species of Macrophyllae (B. aromatica, B. densiflora, and B. balsamifera) and one species of section Paniculatae (B. lanceolaria), and the other species were combined into clade II, but with low posterior probabilities ( Figure 2D). The same results were observed with the NJ, MP, and ME methods (Figures 2A-C): the genus could be divided into two clades, but with a low bootstrap value, especially in the With the cpDNA trnL-F sequence, the results of the four different methods also indicated that the genus Blumea was divided into two clades, but the relationships that were revealed by the four methods were different, especially for the species in clade II ( Figure 2). Combining the topology results indicated that the Bayesian inference (BI) method was possibly the most suitable tree reconstruction method ( Table 2). With the BI method, Blumea could also been divided into two clades: clade I, with the same members of the nrDNA ITS dataset, consisting of three species of Macrophyllae (B. aromatica, B. densiflora, and B. balsamifera) and one species of section Paniculatae (B. lanceolaria), and the other species were combined into clade II, but with low posterior probabilities ( Figure 2D). The same results were observed with the NJ, MP, and ME methods (  Furthermore, the results of the combined sequences indicated that the BI method was also the most suitable tree-reconstructed method for combined sequences (Table 2), with the same topology based on the nrDNA ITS sequence when analyzed with all four methods, (Figure 3). The genus Blumea could be divided into two clades; clade II also consisted of three subclades and two single Furthermore, the results of the combined sequences indicated that the BI method was also the most suitable tree-reconstructed method for combined sequences (Table 2), with the same topology based on the nrDNA ITS sequence when analyzed with all four methods, (Figure 3). The genus Blumea could be divided into two clades; clade II also consisted of three subclades and two single species ( Figure 3D). The same thing was observed from the results of the NJ, MP, and ME methods ( Figure 3A-C), and with species ( Figure 3D). The same thing was observed from the results of the NJ, MP, and ME methods ( Figure 3A-C), and with a better bootstrap posterior supported value. This means that the combined sequence can improve the lower posterior probabilities or bootstrap that is caused by the single sequence.

Divergence Time Estimation and Evolutionary Event Hypothesis
A series of evolutionary events and divergence times were estimated based on a comparison of the two calibration points of the root age of Asteraceae and geography-species differentiation time (Figures 4 and 5). With the root age of Asteraceae as 49-42 Ma [20][21][22] (Figure 4), the time of genus Blumea differentiated from Inuleae is estimated to be 23.20 Ma, with the highest posterior density (HPD) between 32.86 and 14.51 Ma (the time between the late Oligocene and the early Miocene). The divergence time of the two major clades in Blumea is estimated to be 20.12 Ma (with HPD between 28.89 and 12.40 Ma). Further investigation with BAMM [23] and BAMMtools [24] showed

Divergence Time Estimation and Evolutionary Event Hypothesis
A series of evolutionary events and divergence times were estimated based on a comparison of the two calibration points of the root age of Asteraceae and geography-species differentiation time (Figures 4 and 5). With the root age of Asteraceae as 49-42 Ma [20][21][22] (Figure 4), the time of genus Blumea differentiated from Inuleae is estimated to be 23.20 Ma, with the highest posterior density (HPD) between 32.86 and 14.51 Ma (the time between the late Oligocene and the early Miocene). The divergence time of the two major clades in Blumea is estimated to be 20.12 Ma (with HPD between 28.89 and 12.40 Ma). Further investigation with BAMM [23] and BAMMtools [24] showed that this genus underwent an explosive expansion during the Miocene, and many new species speciate during this period ( Figure 4A). that this genus underwent an explosive expansion during the Miocene, and many new species speciate during this period ( Figure 4A).  [23] and BAMMtools [24] showed that ( Figure 4A) this genus experienced an explosive expansion, since it differentiated from Inuleae.  [23] and BAMMtools [24] showed that ( Figure 4A) this genus experienced an explosive expansion, since it differentiated from Inuleae.  (Figures 4 and 5). A third web tool, TimeTree (http://www.timetree.org/) [28], was used for further divergence time estimation and for making the final decision. The results indicated that the median divergence time of B. balsamifera and Plumea carolinensis (Jacq.) G. Don. may have been in the early Miocene, around 26 Ma ago ( Figure 6). Combining the results of TimeTree with the two calibration points for the root age of Asteraceae, we think that the root age of Asteraceae of 76-66 Ma [27] may be relatively accurate divergence time estimation for Blumea. First, based on the paleogeography and paleoclimate of the Oligocene (Figure 6), the Oligocene interrupted the temperature decline in the Paleogene (~33.5 Ma ago), and a stepwise climate increase began at 32.5 Ma and lasted until 25.5 Ma ago [29,30], also accompanied with a reduction of the CO2/O2 rate. This same time was the most important period for grassland and forest expansion, and the sunflower family also underwent an explosive expansion during this time [27]. The early Miocene (~20 Ma ago) is the time of formation and differentiation of Malaysian flora [31]. The old Chinese mainland, because of the relatively warm climate, was the most fertile area on Earth, had become one diversity center of earth. The genus of Blumea, which mostly originated from palaeotropical region, had moved to south China, also accompanied with an explosive expansion for more herbal members, such as B. saussureoides, B. sinuata, B. oblongifolia, B. lacera, and others that are more suitable for the low CO2/O2 rate.  (Figures 4 and 5). A third web tool, TimeTree (http: //www.timetree.org/) [28], was used for further divergence time estimation and for making the final decision. The results indicated that the median divergence time of B. balsamifera and Plumea carolinensis (Jacq.) G. Don. may have been in the early Miocene, around 26 Ma ago ( Figure 6). Combining the results of TimeTree with the two calibration points for the root age of Asteraceae, we think that the root age of Asteraceae of 76-66 Ma [27] may be relatively accurate divergence time estimation for Blumea. First, based on the paleogeography and paleoclimate of the Oligocene (Figure 6), the Oligocene interrupted the temperature decline in the Paleogene (~33.5 Ma ago), and a stepwise climate increase began at 32.5 Ma and lasted until 25.5 Ma ago [29,30], also accompanied with a reduction of the CO 2 /O 2 rate. This same time was the most important period for grassland and forest expansion, and the sunflower family also underwent an explosive expansion during this time [27]. The early Miocene (~20 Ma ago) is the time of formation and differentiation of Malaysian flora [31]. The old Chinese mainland, because of the relatively warm climate, was the most fertile area on Earth, had become one diversity center of earth. The genus of Blumea, which mostly originated from palaeotropical region, had moved to south China, also accompanied with an explosive expansion for more herbal members, such as B. saussureoides, B. sinuata, B. oblongifolia, B. lacera, and others that are more suitable for the low CO2/O2 rate. Combining the evidence from paleogeography and paleoclimate and two root ages of Asteraceae, the originate time of genus Blumea may be estimated to be 49.00-18.43 Ma; the divergence time of the two major clades was further estimated to be 45.23-17.76 Ma ago. This genus experienced an explosive expansion during the Oligocene and Miocene.

Discussion
The genus Blumea is one of the largest genera of Inuleae (Asteraceae), being widely distributed in ancient tropical regions around the Pacific. South China is one diversity center of this genus, containing 30 species. Given the variety of its morphology, its phylogenetic analysis and trait characterization have been vigorously disputed since the first description of this genus. Phylogenetic analysis using molecular markers has been widely used to attempt to resolve this dispute [10,11]. Pornpongrungrueng [11] conducted a phylogenetic analysis on Blumea. The results showed that Blumea (including Blumeopsis flava) is monophyletic, and suggested that this genus could be divided into two main clades and one single species, B. balsamifera, which differ in terms of habitat, ecology, and distribution. However, few samples were obtained from species in China. In this study, 16 Chinese samples, which include 12 species belonging to three sections of Blumea, that were sequenced with nrDNA ITS and cpDNA trnL-F sequences, were used to estimate the phylogenetic relationship between the members of this genus. The results repeatedly verified the conclusions of Pornpongrungrueng [11], which indicated that this genus could be divided into two clades: clade I, including B. aromatica, B. densiflora, B. balsamifera, and B. lanceolaria, and the others belonging to clade II. When comparing the habitat, distribution, chromosome number, and chemical composition, we found that the two main clades differ in all of these aspects plus ecology, with the Blumea clade mostly containing perennial shrubs or subshrubs, which were mostly distributed from evergreen forests, and those with the same chromosome number (2n = 18) and chromosome structure (6M + 8m + 4sm) [32], and whose chemical composition is similar [33]. Clade II is a widespread paleotropical group that comprises mostly annual, weedy herbs of open forests and fields. Divergence time estimation and evolutionary event hypothesizing are important tools in Combining the evidence from paleogeography and paleoclimate and two root ages of Asteraceae, the originate time of genus Blumea may be estimated to be 49.00-18.43 Ma; the divergence time of the two major clades was further estimated to be 45.23-17.76 Ma ago. This genus experienced an explosive expansion during the Oligocene and Miocene.

Discussion
The genus Blumea is one of the largest genera of Inuleae (Asteraceae), being widely distributed in ancient tropical regions around the Pacific. South China is one diversity center of this genus, containing 30 species. Given the variety of its morphology, its phylogenetic analysis and trait characterization have been vigorously disputed since the first description of this genus. Phylogenetic analysis using molecular markers has been widely used to attempt to resolve this dispute [10,11]. Pornpongrungrueng [11] conducted a phylogenetic analysis on Blumea. The results showed that Blumea (including Blumeopsis flava) is monophyletic, and suggested that this genus could be divided into two main clades and one single species, B. balsamifera, which differ in terms of habitat, ecology, and distribution. However, few samples were obtained from species in China. In this study, 16 Chinese samples, which include 12 species belonging to three sections of Blumea, that were sequenced with nrDNA ITS and cpDNA trnL-F sequences, were used to estimate the phylogenetic relationship between the members of this genus. The results repeatedly verified the conclusions of Pornpongrungrueng [11], which indicated that this genus could be divided into two clades: clade I, including B. aromatica, B. densiflora, B. balsamifera, and B. lanceolaria, and the others belonging to clade II. When comparing the habitat, distribution, chromosome number, and chemical composition, we found that the two main clades differ in all of these aspects plus ecology, with the Blumea clade mostly containing perennial shrubs or subshrubs, which were mostly distributed from evergreen forests, and those with the same chromosome number (2n = 18) and chromosome structure (6M + 8m + 4sm) [32], and whose chemical composition is similar [33]. Clade II is a widespread paleotropical group that comprises mostly annual, weedy herbs of open forests and fields. Divergence time estimation and evolutionary event hypothesizing are important tools in phylogenetic analysis, but, it is difficult to estimate evolutionary events and divergence time because of the fossil record. Choosing some geological events as calibration points could provide more evidence for the process of phylogenetic evolution and divergence event hypothesis [34]. Most of the families have no fossil record because the fossil record information on Inuleae is scant. Several fossils are clearly identifiable as members of Asteraceae and Goodeniaceae from the Oligocene and, later, and there are seeds of Menyanthaceae and Campanulaceae from the Oligocene and Miocene, respectively [35]. There are also several records of Eocene pollen of Asteraceae found in South America or China [36,37]. As Oligocene pollen of the Asteraceae is of a comparatively specialized type and it is found on several continents, it is reasonable to assume that it dates back at least to the Oligocene-Eocene boundary (42)(43)(44)(45)(46)(47)(48)(49). Hind et al. estimated the same date [38]. However, according to a more recent paper by Barreda et al. [27], the most recent common ancestor of Asteraceae may have originated 76-66 Ma ago. In this research, by considering the time of formation of Hainan Island and Qiongzhou Strait (5.8-3.7 Ma) [39][40][41], we were able to estimate the divergence time of Blumea DC. The results indicated that Blumea may have originated from 49.00 to 18.43 Ma, and had an explosive expansion during the Oligocene and Miocene (45.23-17.76 Ma), when the two major clades differentiated. The evidence from paleogeography and paleoclimate are consistent with the conclusion that Blumea underwent differentiation and explosive expansion during the Oligocene and Miocene [29,30].

Plant Materials
In this study, 16 Chinese samples, including 12 species that belong to three sections of Blumea DC, were newly collected and sequenced, and the specimens of those samples were deposited in the herbarium of the Chinese Academy of Tropical Agricultural Sciences (herbarium code: CATCH). Nine reference samples of Blumea and seven samples of outgroups (seven species, six genera) were downloaded from Genbank. Table 3 provides the sequence information, locality, and other details of the accessions.

DNA Isolation
Genomic DNA was isolated from the above accessions while using the QIAGEN DNeasy Plant Minikit (QIAGEN, Düsseldorf, Germany), according to the manufacturer's instructions. The DNA was diluted to 30 ng/µL and the ultraviolet absorption values at A 260 were used.

DNA Amplification and Sequencing: ITS and trnL-F Sequence Amplification and Sequencing
Internal transcribed spacer (ITS) and trnL-F sequence amplification and analysis were conducted according to the previously established protocols [39,40]. The PCR mixture consisted of 60 ng DNA, 1.0 µM of each primer (Invitrogen Corp., Carlsbad, CA, USA), and 25.0 µL of 2× Taq PCR Master Mix (0.1 U/µL Taq polymerase, 500 µM of each dNTP, 20 mM Tris-HCl at pH 8.3, 100 mM KCl, 3 mM MgCl 2 ; Tiangen, Beijing, China) in a 50-µL volume. The PCR cycle protocol was based on previously established methods [42]. The PCR products that were obtained were separated on a 1.2% agarose gel, and then the bands of the expected size were excised from the gel and purified while using the QIAquick Gel Extraction Kit (QIAGEN Inc., Valencia, CA, USA), according to the manufacturer's instructions. The purified PCR products were subjected to sequencing (Invitrogen, Shanghai, China).

Single Sequence
First, the raw sequences of the ITS and trnL-F fragments (Invitrogen, Shanghai, China) were edited to remove the low-quality parts. The edited sequences were then submitted to GenBank ( Table 3). The sequences of both loci were aligned while using MEGA 7.0 [12] with the default options and saved in *.nexus and *.aln formats. The interleaved *.nexus files were converted to the non-interleaved *.nex format while using PAUP 4.0a164 [15]. The sequence saturation and best substitution model were analyzed using DAMBE [13] and jmodeltest 2.1.7 [14], and then the poorly aligned positions and divergent regions in each aligned sequence were removed using Gblocks0.91b [43,44], using the filtration parameters that were described by Guillou [44].  The datasets of ITS and trnL-F sequences were loaded into PAUP 4.0 (https://paup.phylosolutions. com/) [14] for compatibility testing (ILD test) [47]. The compatibility testing (ILD test) revealed no incongruence between ITS and trnL-F sequences (p = 0.19). Subsequently, the sequences were merged while using SequenceMatrix1.8 [48] and saved in *.nex format.

Neighbor-Joining (NJ) Method
The NJ analysis was performed with PAUP version 4.0a164 [15] with heuristic searches under the random option of the stepwise addition algorithm with equal weighting of all characters. Bootstrap 50% majority-rule consensus trees were reconstructed by performing heuristic searches with 1000 bootstrap replicates.

Maximum Parsimony (MP) Method
The same as the NJ method, an MP analysis was reconstructed with PAUP 4.0a164 [15], performing heuristic searches with 1000 bootstrap replicates.

Maximum Likelihood (ML) Method
Before ML analysis with IQ-TREE [49,50], the best substitution models of the single or combined sequences were computed with jmodeltest 2.1.7 [14]. Subsequently, the phylogenetic tree with the ML method was analyzed with IQ-TREE [49,50] using the default settings, with 1000 bootstrap (BP) values for tree evaluation.

Bayesian inference (BI) method
With the best substitution models and parameters that were computed by jmodeltest 2.1.7 [14], BI analysis was performed with MrBayes 3.2.6 [51], but not all of the substitution models computed by jmodeltest 2.1.7 [14] were available in MrBayes, so an General time reversible (GTR) model with unequal rates and unequal base freq) model were used for BI analysis, and the base frequency of the nucleic acids and transition/transversion rates were set according the results from jmodeltest 2.1.7 [14]. The analysis parameters were set as four chains that were run simultaneously for 10,000,000 generations or until the average standard deviation of the split frequencies fell below 0.01. The trees were sampled with every 100 generations and a total of 20,000 trees were generated with the initial sample. Subsequently, the consensus tree was summarized with LogCombiner 1.10.4 and TreeAnnotator 1.10.4, and the parameters were first set to 10%, but then discarded as burn-in, and a 50% majority-rule consensus tree with posterior probability (PP) values was used.

Time Node Selection and Correction
As there are no unequivocal fossils for Blumea DC and its relatives in Inuleae, the divergence and diversification times were difficult to estimate. Two root ages of Asteraceae were compared to accurately estimate divergence time estimation for Blumea, in this study: the recent results from Barreda et al. [27], who estimated the root age of Asteraceae at 76-66 Ma; and, the time generally accepted by other researchers, . Subsequently, the geological ages of Qiongzhou Strait and Hainan Island, 5.8-3.7 Ma, were selected as the latest differentiation time of B. aromatica DC samples from Guizhou and Hainan [38,40,41]. To investigate the speciation rates of Blumea DC along with time, BAMM (http://bamm-project.org/) [23] and BAMMtools (https: //cran.rstudio.com/web/packages/BAMMtools/index.html) [24] were used to compare the speciation rates with the two hypotheses of the root age of Asteraceae.

Divergence Time Estimation and Evolutionary Event Hypothesis
All of the divergence times and evolutionary events were hypothesized while using BEAST 1.10.4 (http://beast.community/) [25]. First, the likelihood ratio test (LRT) [52] was used to determine that the ITS or trnL-F data conformed to the molecular clock hypothesis. The results indicated that the Blumea data were not suitable for the molecular clock hypothesis (p = 0.03). Afterwards, the divergence times were analyzed while using BEAST 1.10.4 [25], incorporating an uncorrelated lognormal clock model and a birth-death speciation process, and the nucleotide substitution model settings were compared with the jmodeltest results. The parameters were set to be similar to those used in MrBayes 3.2.6 during the analysis process [51], and with four gamma Categories, 10,000,000 generations for four independent Markov chain Monte Carlo (MCMC) runs, sampling every 10,000 generations, with substitution and clock models being unlinked between partitions. The convergence and effective sample size (>200) of parameters were analyzed with Tracer 1.7. LogCombiner 1.10.4 and TreeAnnotator 1.10.4 were used for summarizing the trees, and the parameters, which were set at first to 10%, were discarded as burn-in, and a 50% majority-rule consensus tree with PP values was used instead. Lastly, a strap package [26] in R was used for visualization of the results of BEAST to reflect the divergence time of Blumea, especially the geological time scale.
Funding: This research was funded by the National Natural Science Foundation of China, grant numbers 81374065 and 81403035.

Conflicts of Interest:
The authors declare no conflict of interest.