The bowerbirds in New Guinea and Australia include species that build the largest and perhaps most elaborately decorated constructions outside of humans. The males use these courtship bowers, along with their displays, to attract females. In these species, the mating system is polygynous and the females alone incubate and feed the nestlings. The bowerbirds also include 10 species of the socially monogamous catbirds in which the male participates in most aspects of raising the young. How the bower-building behavior evolved has remained poorly understood, as no comprehensive phylogeny exists for the family. It has been assumed that the monogamous catbird clade is sister to all polygynous species. We here test this hypothesis using a newly developed pipeline for obtaining homologous alignments of thousands of exonic and intronic regions from genomic data to build a phylogeny. Our well-supported species tree shows that the polygynous, bower-building species are not monophyletic. The result suggests either that bower-building behavior is an ancestral condition in the family that was secondarily lost in the catbirds, or that it has arisen in parallel in two lineages of bowerbirds. We favor the latter hypothesis based on an ancestral character reconstruction showing that polygyny but not bower-building is ancestral in bowerbirds, and on the observation that Scenopoeetes dentirostris, the sister species to one of the bower-building clades, does not build a proper bower but constructs a court for male display. This species is also sexually monomorphic in plumage despite having a polygynous mating system. We argue that the relatively stable tropical and subtropical forest environment in combination with low predator pressure and rich food access (mostly fruit) facilitated the evolution of these unique life-history traits.
A major concern in molecular clock dating is how to use information from the fossil record to calibrate genetic distances from DNA sequences. Here we infer divergence times in the royal ferns with three Bayesian dating methods that differ in how calibration is achieved: “node dating” (ND) in BEAST, “total evidence” (TE) dating in MrBayes, and the “fossilized birth–death” (FBD) in FDPPDiv. Osmundaceae have 16–17 species in four genera, two mainly in the Northern Hemisphere and two in South Africa and Australasia; they are the sister clade to the remaining leptosporangiate ferns. Their fossil record consists of at least 150 species in ∼17 genera. For ND, we used the five oldest fossils, whereas for TE and FBD dating, which do not require forcing fossils to nodes and thus can use more fossils, we included up to 36 rhizomes and frond compression/impression fossils, which for TE dating were scored for 33 morphological characters. We also subsampled 10%, 25%, and 50% of the 36 fossils to assess model sensitivity. FBD-derived divergence ages were generally greater than those inferred from ND; two of seven TE-derived ages agreed with FBD-obtained ages, the others were much younger or much older than ND or FBD ages. We prefer the FBD-derived ages because they best fit the Osmundales fossil record (including Triassic fossils not used in our study). Under the preferred model, the clade encompassing extant Osmundaceae (and many fossils) dates to the latest Paleozoic to Early Triassic; divergences of the extant species occurred during the Neogene. Under the assumption of constant speciation and extinction rates, the FBD approach yielded speciation and extinction rates that overlapped those obtained from just neontological data. However, FBD estimates of speciation and extinction are sensitive to violations in the assumption of continuous fossil sampling; therefore, these estimates should be treated with caution.
Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well suited to teaching statistical models, to facilitating communication among phylogeneticists, and to developing generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution.
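The factorization idea can be illustrated with a deliberately tiny, non-phylogenetic model. The sketch below is not taken from the paper: it assumes a two-level graph in which one stochastic node `mu` has a normal prior and each observation is a child node drawn from a normal distribution centered on `mu`. The function names, the prior and observation standard deviations, and the proposal step size are all illustrative choices. The joint log density is the sum of one conditional log density per node, and a random-walk Metropolis–Hastings sampler needs nothing more than that sum.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def log_joint(mu, data, prior_sd=10.0, obs_sd=1.0):
    """Joint log density of the graph: log p(mu) + sum_i log p(x_i | mu).

    Each term corresponds to one node conditioned on its parents.
    The values of prior_sd and obs_sd are illustrative assumptions.
    """
    lp = normal_logpdf(mu, 0.0, prior_sd)                    # prior node
    lp += sum(normal_logpdf(x, mu, obs_sd) for x in data)    # observed child nodes
    return lp


def metropolis_hastings(data, n_iter=5000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings over mu with a symmetric normal proposal."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_joint(mu, data)
    samples = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_joint(proposal, data)
        # Symmetric proposal, so the acceptance ratio is the ratio of joint densities.
        # 1 - random() lies in (0, 1], which keeps the logarithm defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples
```

Calling `metropolis_hastings([1.8, 2.1, 2.4, 1.9, 2.3])` returns a list of draws for `mu`; discarding an initial portion as burn-in and averaging the rest approximates the posterior mean. A phylogenetic model graph would replace the two node types here with substitution-model parameters, branch lengths, and a tree topology, but the sampler would still only evaluate a sum of per-node conditional log densities.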
Phylogenomic studies have improved understanding of deep metazoan phylogeny and show promise for resolving incongruences among analyses based on limited numbers of loci. One region of the animal tree that has been especially difficult to resolve, even with phylogenomic approaches, is relationships within Lophotrochozoa (the animal clade that includes molluscs, annelids, and flatworms among others). Lack of resolution in phylogenomic analyses could be due to insufficient phylogenetic signal, limitations in taxon and/or gene sampling, or systematic error. Here, we investigated why lophotrochozoan phylogeny has been such a difficult question to answer by identifying and reducing sources of systematic error. We supplemented existing data with 32 new transcriptomes spanning the diversity of Lophotrochozoa and constructed a new set of Lophotrochozoa-specific core orthologs. Of these, 638 orthologous groups (OGs) passed strict screening for paralogy using a tree-based approach. In order to reduce possible sources of systematic error, we calculated branch-length heterogeneity, evolutionary rate, percent missing data, compositional bias, and saturation for each OG and analyzed increasingly strict subsets of only the most stringent (best) OGs for these five variables. Principal component analysis of the values for each factor examined for each OG revealed that compositional heterogeneity and average patristic distance contributed most to the variance observed along the first principal component while branch-length heterogeneity and, to a lesser extent, saturation contributed most to the variance observed along the second. Missing data did not strongly contribute to either. Additional sensitivity analyses examined effects of removing taxa with heterogeneous branch lengths, large amounts of missing data, and compositional heterogeneity.
Although our analyses do not unambiguously resolve lophotrochozoan phylogeny, we advance the field by reducing the list of viable hypotheses. Moreover, our systematic approach for dissection of phylogenomic data can be applied to explore sources of incongruence and poor support in any phylogenomic dataset.
Reconstructing the biogeographic history of groups present in continuous arid landscapes is challenging due to the difficulties in defining discrete areas for analyses, and even more so when species largely overlap both in terms of geography and habitat preference. In this study, we use a novel approach to estimate ancestral areas for the small plant genus Centipeda. We apply continuous diffusion of geography by a relaxed random walk where each species is sampled from its extant distribution on an empirical distribution of time-calibrated species-trees. Using a distribution of previously published substitution rates of the internal transcribed spacer (ITS) for Asteraceae, we show how the evolution of Centipeda correlates with the temporal increase of aridity in the arid zone since the Pliocene. Geographic estimates of ancestral species show a consistent pattern of speciation of early lineages in the Lake Eyre region, with a division in more northerly and southerly groups since ∼840 ka. Summarizing the geographic slices of species-trees at the time of the latest speciation event (∼20 ka) indicates no presence of the genus in Australia west of the combined desert belt of the Nullarbor Plain, the Great Victoria Desert, the Gibson Desert, and the Great Sandy Desert, or beyond the main continental shelf of Australia. The result indicates all western occurrences of the genus to be a result of recent dispersal rather than ancient vicariance. This study contributes to our understanding of the spatiotemporal processes shaping the flora of the arid zone, and offers a significant improvement in inference of ancestral areas for any organismal group distributed where it remains difficult to describe geography in terms of discrete areas.
Tropical forests of Central and South America represent hotspots of biological diversity. Tree squirrels of the tribe Sciurini are an excellent model system for the study of tropical biodiversity as these squirrels disperse exceptional distances, and after colonizing the tropics of Central and South America, they have diversified rapidly. Here, we compare signals from DNA sequences with morphological signals using pictures of skulls and computational simulations. Phylogenetic analyses reveal step-wise geographic divergence across the Northern Hemisphere. In Central and South America, tree squirrels form two separate clades, which split from a common ancestor. Simulations of ancestral distributions show western Amazonia as the epicenter of speciation in South America. This finding suggests that wet tropical forests on the foothills of the Andes possibly served as refugia of squirrel diversification during Pleistocene climatic oscillations. Comparison of phylogeny and morphology reveals one major discrepancy: Microsciurus species are a single clade morphologically but are polyphyletic genetically. Modeling of morphology-diet relationships shows that the only group of species with a direct link between skull shape and diet are the bark-gleaning insectivorous species of Microsciurus. This finding suggests that the current designation of Microsciurus as a genus is based on convergent ecologically driven changes in morphology.
Oceanic islands originate from volcanism or tectonic activity without connections to continental landmasses, are colonized by organisms, and eventually vanish due to erosion and subsidence. Colonization of oceanic islands occurs through long-distance dispersals (LDDs) or metapopulation vicariance, the latter resulting in lineages being older than the islands they inhabit. If metapopulation vicariance is valid, island ages cannot be reliably used to provide maximum age constraints for molecular dating. We explore the relationships between the ages of members of a widespread plant genus (Planchonella, Sapotaceae) and their host islands across the Pacific to test various assumptions of dispersal and metapopulation vicariance. We sampled three nuclear DNA markers from 156 accessions representing some 100 Sapotaceae taxa, and analyzed these in BEAST with a relaxed clock to estimate divergence times and with a phylogeographic diffusion model to estimate range expansions over time. The phylogeny was calibrated with a secondary point (the root) and fossils from New Zealand. The dated phylogeny reveals that the ages of Planchonella species are, in most cases, consistent with the ages of the islands they inhabit. Planchonella is inferred to have originated in the Sahul Shelf region, to which it back-dispersed multiple times. Fiji has been an important source for range expansion in the Pacific for the past 23 myr. Our analyses reject metapopulation vicariance in all cases tested, including between oceanic islands, evolution of an endemic Fiji–Vanuatu flora, and westward rollback vicariance between Vanuatu and the Loyalty Islands. Repeated dispersal is the only mechanism able to explain the empirical data. The longest (8900 km) identified dispersal is between Palau in the Pacific and the Seychelles in the Indian Ocean, estimated at 2.2 Ma (0.4–4.8 Ma). The first split in a Hawaiian lineage (P. sandwicensis) matches the age of Necker Island (11.0 Ma), when its ancestor diverged into two species that are distinguished by purple and yellow fruits. Subsequent establishment across the Hawaiian archipelago supports, in part, progression rule colonization. In summary, we found no explanatory power in metapopulation vicariance and conclude that Planchonella has expanded its range across the Pacific by LDD. We contend that this will be seen in many other groups when analyzed in detail.
We modified the phylogenetic program MrBayes 3.1.2 to incorporate the compound Dirichlet priors for branch lengths proposed recently by Rannala, Zhu, and Yang (2012. Tail paradox, partial identifiability and influential priors in Bayesian branch length inference. Mol. Biol. Evol. 29:325-335.) as a solution to the problem of branch-length overestimation in Bayesian phylogenetic inference. The compound Dirichlet prior specifies a fairly diffuse prior on the tree length (the sum of branch lengths) and uses a Dirichlet distribution to partition the tree length into branch lengths. Six problematic data sets originally analyzed by Brown, Hedtke, Lemmon, and Lemmon (2010. When trees grow too long: investigating the causes of highly inaccurate Bayesian branch-length estimates. Syst. Biol. 59:145-161) are reanalyzed using the modified version of MrBayes to investigate properties of Bayesian branch-length estimation using the new priors. While the default exponential priors for branch lengths produced extremely long trees, the compound Dirichlet priors produced posterior estimates that are much closer to the maximum likelihood estimates. Furthermore, the posterior tree lengths were quite robust to changes in the parameter values in the compound Dirichlet priors, for example, when the prior mean of tree length changed over several orders of magnitude. Our results suggest that the compound Dirichlet priors may be useful for correcting branch-length overestimation in phylogenetic analyses of empirical data sets.
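The two-stage structure of such a prior can be sketched as follows. This is a minimal illustration and not the code added to MrBayes: it assumes a gamma prior on the tree length and a symmetric Dirichlet over the branch-length proportions, whereas the published prior may weight internal and external branches differently. The function name `sample_compound_dirichlet` and the default parameter values are hypothetical.

```python
import random


def sample_compound_dirichlet(n_branches, alpha_T=1.0, beta_T=0.1, conc=1.0, rng=random):
    """Draw one set of branch lengths from a compound Dirichlet prior (sketch).

    Stage 1: tree length T ~ Gamma(shape=alpha_T, rate=beta_T)  (assumed form).
    Stage 2: proportions ~ symmetric Dirichlet(conc), and each branch length
             is T times its proportion.
    """
    # random.gammavariate takes shape and SCALE, so scale = 1 / rate.
    tree_length = rng.gammavariate(alpha_T, 1.0 / beta_T)
    # A Dirichlet draw is a vector of independent gamma draws normalized to sum to 1.
    weights = [rng.gammavariate(conc, 1.0) for _ in range(n_branches)]
    total = sum(weights)
    branch_lengths = [tree_length * w / total for w in weights]
    return tree_length, branch_lengths
```

For an unrooted tree of five taxa, `sample_compound_dirichlet(7)` returns a tree length and seven branch lengths that sum to it. Because the prior mean of the tree length is set directly (here `alpha_T / beta_T`), it does not grow with the number of branches, in contrast to independent exponential priors on each branch, whose implied prior on tree length scales with tree size.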
A Bayesian coalescent-based method has recently been proposed to delimit species using multilocus genetic sequence data. Posterior probabilities of different species delimitation models are calculated using reversible-jump Markov chain Monte Carlo algorithms. The method accounts for species phylogenies and coalescent events in both extant and extinct species and accommodates lineage sorting and uncertainties in the gene trees. Although the method is theoretically appealing, its utility in practical data analysis is yet to be rigorously examined. In particular, the analysis may be sensitive to priors on ancestral population sizes and on species divergence times and to gene flow between species. Here we conduct a computer simulation to evaluate the statistical performance of the method, such as the false negatives (the error of lumping multiple species into one) and false positives (the error of splitting one species into several). We found that the correct species model was inferred with high posterior probability with only one or two loci when 5 or 10 sequences were sampled from each population, or with 50 loci when only one sequence was sampled. We also simulated data allowing migration under a two-species model, a mainland-island model and a stepping-stone model to assess the impact of gene flow (hybridization or introgression). The behavior of the method was diametrically different depending on the migration rate. Low rates at < 0.1 migrants per generation had virtually no effect, so that the method, while assuming no hybridization between species, identified distinct species despite small amounts of gene flow. This behavior appears to be consistent with biologists' practice. In contrast, higher migration rates at ≥ 10 migrants per generation caused the method to infer one species. At intermediate levels of migration, the method is indecisive. 
Our results suggest that Bayesian analysis under the multispecies coalescent model may provide important insights into population divergences, and may be useful for generating hypotheses of species delimitation, to be assessed with independent information from anatomical, behavioral, and ecological data.