A call for virtual experiments: accelerating the scientific process

2014 ◽  
Author(s):  
Jonathan Cooper ◽  
Jon Olav Vik ◽  
Dagmar Waltemath

Experimentation is fundamental to the scientific method, whether for exploration, description or explanation. In the exploration of a novel system, children and researchers alike will mess about with things just to see what happens. More formalized experimental protocols ensure reproducible results and form a basis for comparing systems in terms of their response to a specific stimulus. Finally, experiments can be carefully designed to distinguish between competing causal hypotheses based on their different testable predictions about the outcome of the experimental manipulation. One would therefore expect experiments to be central in computational biology too. Indeed, a mathematical model embodies a thought experiment, a causal hypothesis, and its falsifiable predictions. It is easy to ask "what if" we were to change a parameter, an initial state, or the model structure. Papers in computational biology focus on describing and analyzing the effects of such changes, and on confronting models with experimental data. This confrontation often generates new hypotheses, and many if not most new models arise by modification of existing ones. However, most virtual experiments are not built to be reproducible, and thus die with the paper they are published in. This inhibits the critical scrutiny of models, as models are seldom subjected to the same simulation experiments as their predecessors, or revisited later in the light of new data. Perhaps worse, the status quo fails to take full advantage of experiments as a common language between modellers and experimentalists. Despite the growing availability of data and model repositories, there has been only a slow uptake of emerging tools and standards for documenting and sharing the protocols for simulation experiments and their results. We argue that promoting the reuse of virtual experiments would vastly improve the usefulness and relevance of computational models, including in biomedical endeavours such as the Virtual Physiological Human and the Human Brain Project. We review the benefits of reusable virtual experiments: in specifying, assaying, and comparing the behavioural repertoires of models; as prerequisites for reproducible research; to guide model reuse and composition; and for quality assurance in the application of computational biology models. Next, we discuss potential approaches for implementing virtual experiments, arguing that models and experimental protocols should be represented separately, but annotated so as to facilitate the linking of models to experiments and data. We follow with some consideration of open questions and challenges that remain before the use of virtual experiments can become widespread. Lastly, we outline a vision for how the rigorous, streamlined confrontation between experimental datasets and candidate models would enable a "continuous integration" of biological knowledge, akin to the strategy used in software development.
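
The closing analogy can be made concrete with a minimal sketch of what a "continuous integration" loop for models might look like: every candidate model is run through every registered virtual experiment and scored against reference data, so a change that breaks agreement is flagged automatically. All names, models, and the tolerance below are invented for illustration; they are not from the paper.

```python
# Hypothetical sketch of a "continuous integration" loop over models and
# virtual experiments: every model is run through every protocol and scored
# against reference data. All names and values are illustrative.

import math
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Callable[[float], float]  # maps time to an observable

@dataclass
class VirtualExperiment:
    name: str
    protocol: Callable[[Model], List[float]]  # applies a stimulus, returns outputs
    reference: List[float]                    # experimental data to reproduce

def step_response(model: Model) -> List[float]:
    """One illustrative protocol: sample the response to a unit step input."""
    return [model(t) for t in (0.0, 1.0, 2.0, 4.0)]

def run_suite(models: Dict[str, Model],
              experiments: List[VirtualExperiment], tol: float = 0.05) -> None:
    for exp in experiments:
        for name, model in models.items():
            out = exp.protocol(model)
            err = max(abs(o - r) for o, r in zip(out, exp.reference))
            print(f"{exp.name} x {name}: max error {err:.3f} ->",
                  "PASS" if err < tol else "FAIL")

if __name__ == "__main__":
    suite = [VirtualExperiment("unit step", step_response,
                               reference=[0.0, 0.632, 0.865, 0.982])]
    run_suite({"first-order": lambda t: 1.0 - math.exp(-t),
               "linear":      lambda t: 0.25 * t}, suite)
```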


2018 ◽  
Author(s):  
Danilo Bzdok ◽  
Denis Engemann ◽  
Olivier Grisel ◽  
Gaël Varoquaux ◽  
Bertrand Thirion

Abstract
In the 20th century many advances in biological knowledge and evidence-based medicine were supported by p-values and accompanying methods. At the beginning of the 21st century, ambitions towards precision medicine put a premium on detailed predictions for single individuals. The shift causes tension between traditional methods used to infer statistically significant group differences and burgeoning machine-learning tools suited to forecasting an individual's future. This comparison applies the linear model both to identify significantly contributing variables and to find the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how statistical inference and pattern recognition can agree and diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships. However, even statistically strong findings with very low p-values shed little light on their value for achieving accurate prediction in the same dataset. A more complete understanding of the different ways to define 'important' associations is a prerequisite for reproducible research findings that can serve to personalize clinical care.
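
A small simulation in this spirit (a sketch assuming NumPy, SciPy, and scikit-learn; not the paper's actual analyses) shows how the two notions of 'important' can diverge: with enough samples, a weak effect earns a tiny p-value yet barely improves out-of-sample prediction.

```python
# Illustrative contrast between classical inference and out-of-sample
# prediction on the same simulated data. With many samples, a weak effect
# can be highly "significant" yet contribute little predictive accuracy.

import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 1))
y = 0.1 * x[:, 0] + rng.normal(size=n)   # true effect is weak

# Inference: the association is statistically significant...
r, p = stats.pearsonr(x[:, 0], y)
print(f"correlation r = {r:.3f}, p-value = {p:.1e}")

# ...but prediction: cross-validated R^2 shows the variable explains
# only about 1% of the variance in unseen data.
r2 = cross_val_score(LinearRegression(), x, y, cv=5, scoring="r2")
print(f"cross-validated R^2 = {r2.mean():.3f}")
```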


mSystems ◽  
2019 ◽  
Vol 4 (2) ◽  
Author(s):  
Meghan Thommes ◽  
Taiyao Wang ◽  
Qi Zhao ◽  
Ioannis C. Paschalidis ◽  
Daniel Segrè

ABSTRACT
Microbes face a trade-off between being metabolically independent and relying on neighboring organisms for the supply of some essential metabolites. This balance of conflicting strategies affects microbial community structure and dynamics, with important implications for microbiome research and synthetic ecology. A “gedanken” (thought) experiment to investigate this trade-off would involve monitoring the rise of mutual dependence as the number of metabolic reactions allowed in an organism is increasingly constrained. The expectation is that below a certain number of reactions, no individual organism would be able to grow in isolation and cross-feeding partnerships and division of labor would emerge. We implemented this idealized experiment using in silico genome-scale models. In particular, we used mixed-integer linear programming to identify trade-off solutions in communities of Escherichia coli strains. The strategies that we found revealed a large space of opportunities in nuanced and nonintuitive metabolic division of labor, including, for example, splitting the tricarboxylic acid (TCA) cycle into two separate halves. The systematic computation of possible solutions in division of labor for 1-, 2-, and 3-strain consortia resulted in a rich and complex landscape. This landscape displayed a nonlinear boundary, indicating that the loss of an intracellular reaction was not necessarily compensated for by a single imported metabolite. Different regions in this landscape were associated with specific solutions and patterns of exchanged metabolites. Our approach also predicts the existence of regions in this landscape where independent bacteria are viable but are outcompeted by cross-feeding pairs, providing a possible incentive for the rise of division of labor.

IMPORTANCE
Understanding how microbes assemble into communities is a fundamental open issue in biology, relevant to human health, metabolic engineering, and environmental sustainability. A possible mechanism for interactions of microbes is through cross-feeding, i.e., the exchange of small molecules. These metabolic exchanges may allow different microbes to specialize in distinct tasks and evolve division of labor. To systematically explore the space of possible strategies for division of labor, we applied advanced optimization algorithms to computational models of cellular metabolism. Specifically, we searched for communities able to survive under constraints (such as a limited number of reactions) that would not be sustainable by individual species. We found that predicted consortia partition metabolic pathways in ways that would be difficult to identify manually, possibly providing a competitive advantage over individual organisms. In addition to helping understand diversity in natural microbial communities, our approach could assist in the design of synthetic consortia.
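
The flavor of the underlying optimization can be conveyed by a toy mixed-integer program (a sketch using the PuLP library; the linear pathway, reaction budget, and viability rule are invented stand-ins for the authors' genome-scale formulation):

```python
# Toy mixed-integer program in the spirit of the paper's approach, not the
# authors' genome-scale formulation. Each of two strains may keep at most
# `budget` reactions; the community is viable only if every step of a linear
# pathway is carried by at least one strain, with intermediates assumed
# exchangeable. The solver then reveals how the pathway is partitioned.

import pulp

reactions = ["R1", "R2", "R3", "R4", "R5", "R6"]
strains = ["A", "B"]
budget = 3  # below 6, no single strain can run the whole pathway alone

prob = pulp.LpProblem("division_of_labor", pulp.LpMinimize)
keep = pulp.LpVariable.dicts("keep", (strains, reactions), cat="Binary")

# Minimize the total number of reactions retained across the community.
prob += pulp.lpSum(keep[s][r] for s in strains for r in reactions)

# Community viability: every pathway step is covered by someone.
for r in reactions:
    prob += pulp.lpSum(keep[s][r] for s in strains) >= 1

# Per-strain reaction budget, enforcing metabolic specialization.
for s in strains:
    prob += pulp.lpSum(keep[s][r] for r in reactions) <= budget

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for s in strains:
    kept = [r for r in reactions if keep[s][r].value() == 1]
    print(f"strain {s} keeps: {kept}")
```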


2004 ◽  
Vol 5 (1) ◽  
pp. 100-104 ◽  
Author(s):  
C. Vlachos ◽  
R. Gregory ◽  
R. C. Paton ◽  
J. R. Saunders ◽  
Q. H. Wu

This paper presents two approaches to the individual-based modelling of bacterial ecologies and evolution using computational tools. The first approach is a fine-grained model based on networks of interactivity between computational objects representing genes and proteins. The second approach is a coarser-grained, agent-based model designed to explore the evolvability of adaptive behavioural strategies in artificial bacteria represented by learning classifier systems. The structure and implementation of these computational models are discussed, and some results from simulation experiments are presented. Finally, the potential applications of the proposed models to the solution of real-world computational problems, and their use in improving our understanding of the mechanisms of evolution, are briefly outlined.
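
For orientation, a minimal agent-based loop in the coarser-grained style might look like the following (purely illustrative; the paper's agents additionally evolve behavioural strategies via learning classifier systems, which this toy omits):

```python
# Minimal agent-based sketch: bacteria as agents drawing on a shared
# nutrient pool, gaining energy, dividing, and dying. Illustrative only.

import random

class Bacterium:
    def __init__(self, energy=5.0):
        self.energy = energy

    def step(self, nutrients):
        """Consume nutrients if available; return the amount taken up."""
        uptake = min(1.0, nutrients)
        self.energy += uptake - 0.5   # metabolism costs energy
        return uptake

def simulate(steps=20, inflow=8.0, seed=1):
    random.seed(seed)
    population = [Bacterium() for _ in range(5)]
    nutrients = 10.0
    for t in range(steps):
        nutrients += inflow
        random.shuffle(population)           # random feeding order
        for cell in population:
            nutrients -= cell.step(nutrients)
        # division (energy split between parent and offspring) and death
        offspring = [Bacterium(c.energy / 2) for c in population if c.energy > 10]
        for c in population:
            if c.energy > 10:
                c.energy /= 2
        population = [c for c in population + offspring if c.energy > 0]
        print(f"t={t:2d} cells={len(population):3d} nutrients={nutrients:6.1f}")

simulate()
```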


2020 ◽  
Vol 36 (15) ◽  
pp. 4301-4308
Author(s):  
Stephan Seifert ◽  
Sven Gundlach ◽  
Olaf Junge ◽  
Silke Szymczak

Abstract
Motivation: High-throughput technologies allow comprehensive characterization of individuals on many molecular levels. However, training computational models to predict disease status based on omics data is challenging. A promising solution is the integration of external knowledge about structural and functional relationships into the modeling process. We compared four published random forest-based approaches using two simulation studies and nine experimental datasets.
Results: The self-sufficient prediction error approach should be applied when large numbers of relevant pathways are expected. The competing methods hunting and learner of functional enrichment should be used when low numbers of relevant pathways are expected or when the most strongly associated pathways are of interest. The hybrid approach synthetic features is not recommended because of its high false discovery rate.
Availability and implementation: An R package providing functions for data analysis and simulation is available on GitHub (https://github.com/szymczak-lab/PathwayGuidedRF). An accompanying R data package (https://github.com/szymczak-lab/DataPathwayGuidedRF) stores the processed and quality-controlled experimental datasets downloaded from Gene Expression Omnibus (GEO).
Supplementary information: Supplementary data are available at Bioinformatics online.
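
The general recipe can be sketched in a few lines, loosely echoing the prediction-error idea: fit one random forest per predefined gene set and rank pathways by predictive performance. The sketch below uses Python with scikit-learn and a made-up pathway annotation; the published methods, including their permutation-based significance tests, live in the R packages linked above.

```python
# Sketch of the idea behind pathway-guided random forests: one forest per
# pathway (a predefined gene subset), ranked by cross-validated accuracy.
# Illustrative only; not any of the four published approaches exactly.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n, p = 200, 60
X = rng.normal(size=(n, p))
y = (X[:, :5].sum(axis=1) + rng.normal(size=n) > 0).astype(int)  # genes 0-4 informative

# Hypothetical pathway annotation: three disjoint gene sets.
pathways = {"pathway_1": range(0, 20),     # contains the informative genes
            "pathway_2": range(20, 40),
            "pathway_3": range(40, 60)}

for name, genes in pathways.items():
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    acc = cross_val_score(rf, X[:, list(genes)], y, cv=5).mean()
    print(f"{name}: cross-validated accuracy = {acc:.2f}")
```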


2016 ◽  
Vol 3 (3) ◽  
pp. 16-35
Author(s):  
Timothy Jay Carney

People in a variety of settings can be heard uttering the phrase that "knowledge is power," or the roughly equivalent idea that "information is power." The research literature, however, lacks a simple, standardized way to examine the relationship between knowledge and power. In particular, there is a lack of operational, quantitative definitions of this relationship to adequately support the building of complex computational models used in addressing longstanding public health and healthcare delivery issues such as differential access to care, inequitable care and treatment, institutional bias, disparities in health outcomes, and barriers to patient-centered care. The objective of this discussion is to present a relational algorithm that can be used both in conceptual discussions of knowledge empowerment modeling and in building computational models that explore the variable of knowledge empowerment within computer simulation experiments.


2014 ◽  
Author(s):  
Martin Scharm ◽  
Florian Wendland ◽  
Martin Peters ◽  
Markus Wolfien ◽  
Tom Theile ◽  
...  

Sharing in silico experiments is essential for the advance of research in computational biology. Consequently, the COMBINE archive was designed as a digital container format. It eases the management of numerous files, fosters collaboration, and ultimately enables the exchange of reproducible research results. However, manual handling of COMBINE archives is tedious and error-prone. We therefore developed the CombineArchive Toolkit, which supports scientists in promoting and publishing their work by means of creating, exploring, modifying, and sharing archives.
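
For orientation, a COMBINE archive is essentially a ZIP container carrying a manifest that lists each file and its format. A minimal one can be assembled by hand, as in the sketch below (illustrative only; consult the OMEX specification for the authoritative format identifiers, and prefer the CombineArchive Toolkit or a library in real workflows):

```python
# Minimal sketch of assembling a COMBINE archive by hand: a ZIP container
# with a manifest.xml listing each file and its format. The placeholder
# model and protocol contents are invented for illustration.

import zipfile

MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.sedml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""

with zipfile.ZipFile("experiment.omex", "w") as omex:
    omex.writestr("manifest.xml", MANIFEST)
    omex.writestr("model.xml", "<sbml><!-- model goes here --></sbml>")
    omex.writestr("simulation.sedml", "<sedML><!-- protocol goes here --></sedML>")
print("wrote experiment.omex")
```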


2012 ◽  
Vol 9 (73) ◽  
pp. 1846-1855 ◽  
Author(s):  
P. S. L. Anderson ◽  
E. J. Rayfield

Computational models such as finite-element analysis offer biologists a means of exploring the structural mechanics of biological systems that cannot be directly observed. Validated against experimental data, a model can be manipulated to perform virtual experiments, testing variables that are hard to control in physical experiments. The relationship between tooth form and the ability to break down prey is key to understanding the evolution of dentition. Recent experimental work has quantified how tooth shape promotes fracture in biological materials. We present a validated finite-element model derived from physical compression experiments. The model shows close agreement with strain patterns observed in photoelastic test materials and reaction forces measured during these experiments. We use the model to measure strain energy within the test material when different tooth shapes are used. Results show that notched blades deform materials for less strain energy cost than straight blades, giving insights into the energetic relationship between tooth form and prey materials. We identify a hypothetical ‘optimal’ blade angle that minimizes strain energy costs and test alternative prey materials via virtual experiments. Using experimental data and computational models offers an integrative approach to understand the mechanics of tooth morphology.
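
The blade-angle sweep can be mimicked with a toy cost function (a hypothetical sketch; in the paper the strain-energy values come from the validated finite-element model, not from any closed-form expression):

```python
# Illustrative parameter sweep in the spirit of the paper's virtual
# experiments: evaluate a made-up strain-energy cost over notch angle and
# locate the minimum. The toy function merely mimics a U-shaped trade-off.

import numpy as np
from scipy.optimize import minimize_scalar

def strain_energy(angle_deg: float) -> float:
    """Hypothetical cost: very shallow and very acute notches are both
    penalized, producing an interior 'optimal' angle."""
    a = np.radians(angle_deg)
    return 1.0 / np.sin(a) + 2.0 * a  # illustrative U-shaped function

res = minimize_scalar(strain_energy, bounds=(10.0, 170.0), method="bounded")
print(f"'optimal' notch angle ~ {res.x:.1f} degrees, cost {res.fun:.3f}")
```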


2012 ◽  
Vol 27 (2) ◽  
pp. 221-238 ◽  
Author(s):  
Allen Wilhite ◽  
Eric A. Fong

Abstract
Hypothesis testing is uncommon in agent-based modeling, and there are many reasons why (see Fagiolo et al. (2007) for a review). This is one of those uncommon studies: a combination of the new and the old. First, a traditional neoclassical model of decision making is broadened by introducing agents who interact in an organization. The resulting computational model is analyzed using virtual experiments to consider how different organizational structures (different network topologies) affect the evolutionary path of an organization's corporate culture. These computational experiments establish testable hypotheses concerning structure, culture, and performance, and those hypotheses are then tested empirically using data from an international sample of firms. In addition to learning something about organizational structure and innovation, the paper demonstrates how computational models can be used to frame empirical investigations and facilitate the interpretation of results in a traditional fashion.
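
A stripped-down version of such a virtual experiment might compare how quickly a cultural practice reaches consensus on different network topologies (illustrative Python using networkx; the update rule, sizes, and topologies are invented, not the paper's):

```python
# Toy virtual experiment: how does network topology shape the spread of a
# cultural practice in an organization? Agents copy the local majority.

import random
import networkx as nx

def updates_to_consensus(g: nx.Graph, seed: int = 0, cap: int = 5000) -> int:
    """Asynchronous majority dynamics: a random agent adopts the practice
    used by most of its neighbours (ties broken by round())."""
    random.seed(seed)
    state = {n: random.choice([0, 1]) for n in g}
    for step in range(1, cap + 1):
        n = random.choice(list(g))
        neigh = list(g.neighbors(n))
        if neigh:
            state[n] = round(sum(state[m] for m in neigh) / len(neigh))
        if len(set(state.values())) == 1:
            return step
    return cap

ring = nx.watts_strogatz_graph(30, 4, 0.0, seed=1)   # regular ring lattice
sw = nx.watts_strogatz_graph(30, 4, 0.3, seed=1)     # small-world rewiring
print("ring lattice:", updates_to_consensus(ring), "updates")
print("small world :", updates_to_consensus(sw), "updates")
```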

