Harder, better, faster, stronger: Large-scale QM and QM/MM for predictive modeling in enzymes and proteins

2022 ◽  
Vol 72 ◽  
pp. 9-17
Author(s):  
Vyshnavi Vennelakanti ◽  
Azadeh Nazemi ◽  
Rimsha Mehmood ◽  
Adam H. Steeves ◽  
Heather J. Kulik
2020 ◽  
Author(s):  
Jacob Bien ◽  
Xiaohan Yan ◽  
Léo Simpson ◽  
Christian L. Müller

Abstract: Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven, parameter-free, and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest. The excess number of zero or low count measurements at the read level forces traditional microbiome data analysis workflows to remove rare sequencing variants or group them by a fixed taxonomic rank, such as genus or phylum, or by phylogenetic similarity. By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, making user-defined aggregation obsolete while simultaneously integrating seamlessly into the compositional data analysis framework. We illustrate the versatility of our framework in the context of large-scale regression problems in human-gut, soil, and marine microbial ecosystems. We posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbial ecologists gain insights into the structure and functioning of the underlying ecosystem of interest.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jacob Bien ◽  
Xiaohan Yan ◽  
Léo Simpson ◽  
Christian L. Müller

Abstract: Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest. The excess number of zero or low count measurements at the read level forces traditional microbiome data analysis workflows to remove rare sequencing variants or group them by a fixed taxonomic rank, such as genus or phylum, or by phylogenetic similarity. By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, greatly reducing the need for user-defined aggregation in preprocessing while simultaneously integrating seamlessly into the compositional data analysis framework. We illustrate the versatility of our framework in the context of large-scale regression problems in human gut, soil, and marine microbial ecosystems. We posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbiome researchers gain insights into the structure and functioning of the underlying ecosystem of interest.


2018 ◽  
Author(s):  
Fiona Davidson

Knowledge of deep-sea species and their ecosystems is limited due to the inaccessibility of the areas and the prohibitive cost of conducting large-scale field studies. My graduate research has used predictive modeling methods to map hexactinellid sponge habitat extent in the North Pacific, as well as climate-induced changes in oceanic dissolved oxygen levels and how these changes will affect sponges. Results from a MaxEnt model based on sponge presence data from the eastern Pacific, in conjunction with bathymetric terrain derivatives, closely mapped existing sponge habitats and suggested a depth threshold around 3000 meters below which sponges are not found. Early results suggest that oxygen is another important predictor of sponge habitat; including it, along with a variety of other environmental predictors (e.g. based on ocean chemistry, physics and biology) and different model scales, would improve model accuracy. The long-term goal of this research is to apply climate prediction data to the predictive modeling in order to assess the sensitivity of deep-sea sponge habitat to global climate changes.


