gen3sis: A general engine for eco-evolutionary simulations of the processes that shape Earth’s biodiversity

PLoS Biology ◽  
2021 ◽  
Vol 19 (7) ◽  
pp. e3001340
Author(s):  
Oskar Hagen ◽  
Benjamin Flück ◽  
Fabian Fopp ◽  
Juliano S. Cabral ◽  
Florian Hartig ◽  
...  

Understanding the origins of biodiversity has been an aspiration since the days of early naturalists. The immense complexity of ecological, evolutionary, and spatial processes, however, has made this goal elusive to this day. Computer models serve progress in many scientific fields, but in the fields of macroecology and macroevolution, eco-evolutionary models are comparatively less developed. We present a general, spatially explicit, eco-evolutionary engine with a modular implementation that enables the modeling of multiple macroecological and macroevolutionary processes and feedbacks across representative spatiotemporally dynamic landscapes. Modeled processes can include species’ abiotic tolerances, biotic interactions, dispersal, speciation, and evolution of ecological traits. Commonly observed biodiversity patterns, such as α, β, and γ diversity, species ranges, ecological traits, and phylogenies, emerge as simulations proceed. As an illustration, we examine alternative hypotheses expected to have shaped the latitudinal diversity gradient (LDG) during the Earth’s Cenozoic era. Our exploratory simulations simultaneously produce multiple realistic biodiversity patterns, such as the LDG, current species richness, and range size frequencies, as well as phylogenetic metrics. The model engine is open source and available as an R package, enabling future exploration of various landscapes and biological processes, while outputs can be linked with a variety of empirical biodiversity patterns. This work represents a key toward a numeric, interdisciplinary, and mechanistic understanding of the physical and biological processes that shape Earth’s biodiversity.
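The α, β, and γ diversity patterns named in this abstract can be made concrete with a small calculation. The sketch below is not gen3sis code and does not use the package's API. It assumes a hypothetical presence table (grid cell name mapped to the set of species present) and uses Whittaker's multiplicative partition, β = γ / mean(α), which is only one of several β definitions in use.

```python
from statistics import mean


def diversity_partition(sites):
    """Return (alpha, beta, gamma) for a mapping: site name -> set of species.

    alpha: species richness per site
    gamma: richness of the pooled species list
    beta:  Whittaker's multiplicative beta, gamma / mean(alpha)
    """
    if not sites:
        raise ValueError("at least one site is required")
    alpha = {name: len(species) for name, species in sites.items()}
    gamma = len(set().union(*sites.values()))
    mean_alpha = mean(alpha.values())
    if mean_alpha == 0:
        raise ValueError("beta is undefined when every site is empty")
    beta = gamma / mean_alpha
    return alpha, beta, gamma


# Hypothetical three-cell landscape; names are illustrative only.
sites = {
    "cell_1": {"sp_a", "sp_b", "sp_c"},
    "cell_2": {"sp_b", "sp_c"},
    "cell_3": {"sp_d"},
}
alpha, beta, gamma = diversity_partition(sites)
print(alpha, beta, gamma)
```

A β value of 1 would mean every cell holds the full species pool; larger values indicate stronger turnover between cells.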

2021 ◽  
Author(s):  
Oskar Hagen ◽  
Benjamin Flück ◽  
Fabian Fopp ◽  
Juliano S. Cabral ◽  
Florian Hartig ◽  
...  

Abstract: Understanding the origins of biodiversity has been an aspiration since the days of early naturalists. The immense complexity of ecological, evolutionary and spatial processes, however, has made this goal elusive to this day. Computer models serve progress in many scientific fields, but in the fields of macroecology and macroevolution, eco-evolutionary models are comparatively less developed. We present a general, spatially-explicit, eco-evolutionary engine with a modular implementation that enables the modelling of multiple macroecological and macroevolutionary processes and feedbacks across representative spatio-temporally dynamic landscapes. Modelled processes can include environmental filtering, biotic interactions, dispersal, speciation and evolution of ecological traits. Commonly observed biodiversity patterns, such as α, β and γ diversity, species ranges, ecological traits and phylogenies, emerge as simulations proceed. As a case study, we examined alternative hypotheses expected to have shaped the latitudinal diversity gradient (LDG) during the Earth’s Cenozoic era. We found that a carrying capacity linked with energy was the only model variant that could simultaneously produce a realistic LDG, species range size frequencies, and phylogenetic tree balance. The model engine is open source and available as an R-package, enabling future exploration of various landscapes and biological processes, while outputs can be linked with a variety of empirical biodiversity patterns. This work represents a step towards a numeric and mechanistic understanding of the physical and biological processes that shape Earth’s biodiversity.


2022 ◽  
Author(s):  
Sebastian Hoehna ◽  
Bjoern Tore Kopperud ◽  
Andrew F Magee

Diversification rates inferred from phylogenies are not identifiable. There are infinitely many combinations of speciation and extinction rate functions that have the exact same likelihood score for a given phylogeny, forming a congruence class. The specific shape and characteristics of such congruence classes have not yet been studied. Whether speciation and extinction rate functions within a congruence class share common features is also not known. Instead of striving to make the diversification rates identifiable, we can embrace their inherent non-identifiable nature. We use two different approaches to explore a congruence class: (i) testing of specific alternative hypotheses, and (ii) randomly sampling alternative rate functions within the congruence class. Our methods are implemented in the open-source R package ACDC (https://github.com/afmagee/ACDC). ACDC provides a flexible approach to explore the congruence class and provides summaries of rate functions within a congruence class. The summaries can highlight common trends, i.e. increasing, flat or decreasing rates. Although there are infinitely many equally likely diversification rate functions, these can share common features. ACDC can be used to assess if diversification rate patterns are robust despite non-identifiability. In our example, we clearly identify three phases of diversification rate changes that are common among all models in the congruence class. Thus, congruence classes are not necessarily a problem for studying historical patterns of biodiversity from phylogenies.


2017 ◽  
Author(s):  
Jie Tan ◽  
Matthew Huyck ◽  
Dongbo Hu ◽  
René A. Zelaya ◽  
Deborah A. Hogan ◽  
...  

Abstract
Background: Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data.
Results: Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server (http://adage.greenelab.com) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and Δanr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr.
Conclusions: We designed ADAGE signature analysis to perform gene set analysis using data-defined functional gene signatures. This approach addresses an important gap for biologists studying non-traditional model organisms and those without extensive curated resources available. We built both an R package and web server to provide ADAGE signature analysis to the community.
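For readers unfamiliar with the overrepresentation analysis mentioned in the Background, the following sketch shows the usual one-sided hypergeometric test on a gene set. It is not ADAGEpath code and does not reproduce ADAGE signature analysis. The function name, argument names, and the example counts are all hypothetical and chosen only for illustration.

```python
from math import comb


def overrepresentation_p(universe_size, set_size, hits_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe_size: number of genes measured in the experiment
    set_size:      number of those genes belonging to the gene set
    hits_size:     number of differentially expressed genes
    overlap:       differentially expressed genes that fall in the gene set
    """
    if not 0 <= overlap <= min(set_size, hits_size):
        raise ValueError("overlap must lie between 0 and min(set_size, hits_size)")
    if set_size > universe_size or hits_size > universe_size:
        raise ValueError("set_size and hits_size cannot exceed universe_size")
    total = comb(universe_size, hits_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, min(set_size, hits_size) + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 in the set, 4 differentially
# expressed, 3 of which are in the set.
p_value = overrepresentation_p(universe_size=20, set_size=5, hits_size=4, overlap=3)
print(p_value)
```

In practice one such p-value is computed per gene set and the results are corrected for multiple testing, a step omitted from this sketch.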


2020 ◽  
Author(s):  
Benjamin G Freeman ◽  
Dolph Schluter ◽  
Joseph A Tobias

Abstract: Where is evolution fastest? The biotic interactions hypothesis proposes that greater species richness creates more ecological opportunity, driving faster evolution at low latitudes, whereas the “empty niches” hypothesis proposes that ecological opportunity is greater where diversity is low, spurring faster evolution at high latitudes. Here we tested these contrasting predictions by analyzing rates of bird beak evolution for a global dataset of 1141 sister pairs of birds. Beak size evolves at similar rates across latitudes, while beak shape evolves faster in the temperate zone, consistent with the empty niches hypothesis. We show in a meta-analysis that trait evolution and recent speciation rates are faster in the temperate zone, while rates of molecular evolution are slightly faster in the tropics. Our results suggest that drivers of evolutionary diversification are more potent at higher latitudes, thus calling into question multiple hypotheses invoking faster tropical evolution to explain the latitudinal diversity gradient.


2021 ◽  
Author(s):  
Huan Chen ◽  
Brian Caffo ◽  
Genevieve Stein-O’Brien ◽  
Jinrui Liu ◽  
Ben Langmead ◽  
...  

Summary: Integrative analysis of multiple data sets has the potential of fully leveraging the vast amount of high throughput biological data being generated. In particular, such analysis will be powerful in making inference from publicly available collections of genetic, transcriptomic and epigenetic data sets which are designed to study shared biological processes, but which vary in their target measurements, biological variation, unwanted noise, and batch variation. Thus, methods that enable the joint analysis of multiple data sets are needed to gain insights into shared biological processes that would otherwise be hidden by unwanted intra-data set variation. Here, we propose a method called two-stage linked component analysis (2s-LCA) to jointly decompose multiple biologically related experimental data sets with biological and technological relationships that can be structured into the decomposition. The consistency of the proposed method is established and its empirical performance is evaluated via simulation studies. We apply 2s-LCA to jointly analyze four data sets focused on human brain development and identify meaningful patterns of gene expression in human neurogenesis that have shared structure across these data sets. The code to conduct 2s-LCA has been compiled into an R package “PJD”, which is available at https://github.com/CHuanSite/PJD.


2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Julian Friedrich ◽  
Hans-Peter Hammes ◽  
Guido Krenning

Abstract: microRNAs (miRNAs) regulate gene expression and thereby influence biological processes in health and disease. As a consequence, miRNAs are intensely studied and literature on miRNAs has been constantly growing. While this growing body of literature reflects the interest in miRNAs, it makes it challenging to maintain an overview, and the comparison of miRNAs that may function across diverse disease fields is complex due to this large number of relevant publications. To address these challenges, we designed miRetrieve, an R package and web application that provides an overview of miRNAs. By text mining, miRetrieve can characterize and compare miRNAs within specific disease fields and across disease areas. This overview provides focus and facilitates the generation of new hypotheses. Here, we explain how miRetrieve works and how it is used. Furthermore, we demonstrate its applicability in an exemplary case study and discuss its advantages and disadvantages.


2010 ◽  
Vol 278 (1713) ◽  
pp. 1777-1785 ◽  
Author(s):  
Sean P. Mullen ◽  
Wesley K. Savage ◽  
Niklas Wahlberg ◽  
Keith R. Willmott

Latitudinal gradients in species richness are among the most well-known biogeographic patterns in nature, and yet there remains much debate and little consensus over the ecological and evolutionary causes of these gradients. Here, we evaluated whether two prominent alternative hypotheses (namely differences in diversification rate or clade age) could account for the latitudinal diversity gradient in one of the most speciose neotropical butterfly genera ( Adelpha ) and its close relatives. We generated a multilocus phylogeny of a diverse group of butterflies in the containing tribe Limenitidini, which has both temperate and tropical representatives. Our results suggest there is no relationship between clade age and species richness that could account for the diversity gradient, but that instead it could be explained by a significantly higher diversification rate within the predominantly tropical genus Adelpha . An apparent early larval host-plant shift to Rubiaceae and other plant families suggests that the availability of new potential host plants probably contributed to an increase in diversification of Adelpha in the lowland Neotropics. Collectively, our results support the hypothesis that the equatorial peak in species richness observed within Adelpha is the result of increased diversification rate in the last 10–15 Myr rather than a function of clade age, perhaps reflecting adaptive divergence in response to the dramatic host-plant diversity found within neotropical ecosystems.


2017 ◽  
Author(s):  
Felix May ◽  
Katharina Gerstner ◽  
Dan McGlinn ◽  
Xiao Xiao ◽  
Jonathan M. Chase

Abstract
1. Estimating biodiversity and its changes in space and time poses serious methodological challenges. First, there has been a long debate on how to quantify biodiversity, and second, measurements of biodiversity change are scale-dependent. Therefore, comparisons of biodiversity metrics between communities are ideally carried out across scales. Simulation can be used to study the utility of biodiversity metrics across scales, but most approaches are system specific and plagued by large parameter spaces and therefore cumbersome to use and interpret. However, realistic spatial biodiversity patterns can be generated without reference to ecological processes, which suggests a simple simulation framework could provide an important tool for ecologists.
2. Here, we present the R package mobsim that allows users to simulate the abundances and the spatial distribution of individuals of different species. Users can define key properties of communities, including the total numbers of individuals and species, the relative abundance distribution, and the degree of spatial aggregation. Furthermore, the package provides functions that derive biodiversity patterns from simulated communities, or from observed data, as well as functions that simulate different sampling designs.
3. We show several example applications of the package. First, we illustrate how species rarefaction and accumulation curves can be used to disentangle changes in the fundamental biodiversity components: (i) total abundance, (ii) relative abundance distribution, and (iii) species aggregation. Second, we demonstrate how mobsim can be used to assess the performance of species-richness estimators. The latter indicates how spatial aggregation challenges classical non-spatial species-richness estimators.
4. mobsim allows the simulation and analysis of a large range of biodiversity scenarios and sampling designs in an efficient and comprehensive way. The simplicity and control provided by the package can also make it a useful didactic tool. The combination of controlled simulations and their analysis will facilitate a more rigorous interpretation of real world data that exhibit sampling effects and scale-dependence.
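The species rarefaction curves mentioned in this abstract can be illustrated with the classical individual-based formula: the expected number of species in a random sample of n individuals drawn without replacement is the sum over species of 1 − C(N − N_i, n) / C(N, n), where N is the total number of individuals and N_i the abundance of species i. The sketch below is written in Python for illustration, is not mobsim code, and uses a hypothetical abundance vector. It ignores spatial aggregation, which the abstract identifies as a separate component.

```python
from math import comb


def rarefied_richness(abundances, n):
    """Expected species richness in a random subsample of n individuals."""
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total abundance")
    denom = comb(total, n)
    # For each species, 1 - P(species is absent from the subsample).
    return sum(1 - comb(total - a, n) / denom for a in abundances)


# Hypothetical community of three species and ten individuals.
abundances = [5, 3, 2]
curve = [rarefied_richness(abundances, n) for n in range(1, sum(abundances) + 1)]
print(curve)
```

The curve starts at one species for a single individual and reaches the full richness once every individual is sampled. Its shape in between reflects total abundance and the relative abundance distribution.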


2022 ◽  
Author(s):  
Sachin Muralidharan ◽  
Farah Zahir ◽  
Ahmed M. Mehdi

Aims/hypothesis: The purpose of this study is to manually and semi-automatically curate a database and develop an R package that will act as a comprehensive resource to understand how biological processes are dysregulated due to interactions with environmental factors. Methods: We followed a two-step process to achieve the objectives of this study. First, we conducted a systematic review of the existing gene expression datasets to identify the integrated genomic and environmental factors used in available studies. This enabled us to curate a comprehensive genomic-environmental database for four key environmental factors (smoking, diet, infections and toxic chemicals) associated with various autoimmune and chronic conditions. Second, we developed a statistical analysis package that allows users to understand the relationships between differentially expressed genes and environmental factors under different disease conditions. Results: The initial database search run on the Gene Expression Omnibus (GEO) and the Molecular Signature Database (MSigDB) retrieved a total of 90,018 articles. After title and abstract screening against pre-set criteria, a total of 186 studies were selected. From those, 243 individual sets of genes, or gene modules, were obtained. We then curated a database containing four environmental factors, namely cigarette smoking, diet, infections and toxic chemicals, along with a total of 25,789 genes that had an association with one or more of these factors. In six case studies, the database and statistical analysis package were then tested with lists of differentially expressed genes obtained from the published literature related to type 1 diabetes, rheumatoid arthritis, small cell lung cancer, cobalt exposure, COVID-19 and smoking. On testing, we uncovered statistically enriched biological processes, which could help us understand the pathways associated with environmental factors and gene modules. 
Conclusions: A novel curated database and software tool is provided as an R Package. Users can enter a list of genes to discover associated environmental factors under various disease conditions.

