Objective discovery of dominant dynamical processes with machine learning

Abstract: Significant advances in the understanding and modeling of dynamical systems have been enabled by the identification of dynamical regimes, that is, processes that locally and approximately dominate system behavior. The conventional regime identification method involves tedious and ad hoc parsing of data to obtain scales that indicate which governing equation terms are dominant in each regime. Surprisingly, no objective and universally applicable criterion exists to robustly identify dynamical regimes in an unbiased manner, for either conventional or machine learning-based methods of analysis. Here, we formally define dynamical regime identification as an optimization problem by using a verification criterion, and we show that an unsupervised learning framework can automatically and credibly identify regimes. This eliminates reliance upon ad hoc conventional analyses, with vast potential to accelerate discovery. Our verification criterion also enables unbiased comparison of regimes identified by different methods. In addition to diagnostic applications, the verification criterion and learning framework are immediately useful for data-driven dynamical process modeling, and are relevant to researchers interested in the development of inherently interpretable methods for scientific machine learning. Automation of this kind of approximate mechanistic analysis is necessary for scientists to gain new dynamical insights from increasingly large data streams.
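
The idea of terms that "locally and approximately dominate" can be illustrated with a small sketch. The Python below assumes a hypothetical table of term magnitudes at each sample point, hypothetical term names, and a hypothetical threshold `ratio`; it groups sample points by the set of terms whose magnitude is at least `ratio` times the largest term. This is a simple magnitude heuristic for illustration only, not the verification criterion or the unsupervised learning framework described in the abstract.

```python
def dominant_terms(magnitudes, ratio=0.1):
    """Return the names of terms at least `ratio` times the largest term.

    `magnitudes` maps a term name to its absolute magnitude at one sample
    point. The input layout and the `ratio` threshold are assumptions
    made for this illustration.
    """
    largest = max(magnitudes.values())
    if largest == 0:
        return frozenset()
    return frozenset(name for name, value in magnitudes.items()
                     if value >= ratio * largest)


def label_regimes(samples, ratio=0.1):
    """Group sample indices by their set of dominant terms."""
    regimes = {}
    for index, magnitudes in enumerate(samples):
        key = dominant_terms(magnitudes, ratio)
        regimes.setdefault(key, []).append(index)
    return regimes


# Hypothetical term magnitudes for a three-term balance at four points.
samples = [
    {"advection": 1.0, "pressure": 0.9, "viscous": 0.01},
    {"advection": 0.8, "pressure": 1.0, "viscous": 0.02},
    {"advection": 0.01, "pressure": 1.0, "viscous": 0.95},
    {"advection": 0.5, "pressure": 0.6, "viscous": 0.7},
]
regimes = label_regimes(samples)
for terms, indices in regimes.items():
    print(sorted(terms), indices)
```

The result depends directly on the hand-picked `ratio`, which is the kind of ad hoc choice the abstract argues an objective criterion should replace.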

KDML: a machine-learning framework for inference of multi-scale gene functions from genetic perturbation screens

10.1101/761106 ◽

2019 ◽

Cited By ~ 1

Author(s):

Heba Z. Sailem ◽

Jens Rittscher ◽

Lucas Pelkmans

Keyword(s):

Colorectal Cancer ◽

Machine Learning ◽

Large Scale ◽

Ad Hoc ◽

Olfactory Receptors ◽

Functional Enrichment ◽

Learning Framework ◽

Gene Functions ◽

Health And Disease ◽

Colorectal Cancer Patients

Abstract: Characterising context-dependent gene functions is crucial for understanding the genetic bases of health and disease. To date, inference of gene functions from large-scale genetic perturbation screens has been based on ad hoc analysis pipelines involving unsupervised clustering and functional enrichment. We present Knowledge-Driven Machine Learning (KDML), a framework that systematically predicts multiple functions for a given gene based on the similarity of its perturbation phenotype to those of genes with known function. As proof of concept, we test KDML on three datasets describing phenotypes at the molecular, cellular and population levels, and show that it outperforms traditional analysis pipelines. In particular, KDML identified an abnormal multicellular organisation phenotype associated with the depletion of olfactory receptors and TGFβ and WNT signalling genes in colorectal cancer cells. We validate these predictions in colorectal cancer patients and show that olfactory receptor expression is predictive of worse patient outcome. These results highlight KDML as a systematic framework for discovering novel scale-crossing and clinically relevant gene functions. KDML is highly generalizable and applicable to various large-scale genetic perturbation screens.
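
Predicting functions from phenotype similarity can be sketched in a few lines. The Python below is a stand-in, not KDML itself: it assumes hypothetical phenotype vectors, hypothetical gene names and function labels, and a hypothetical cosine-similarity `threshold`, and it transfers every function of any annotated gene whose phenotype is similar enough to the query.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_functions(query, annotated, threshold=0.9):
    """Collect functions of annotated genes whose phenotype resembles `query`.

    `annotated` maps a gene name to (phenotype vector, set of functions).
    The threshold is an assumption chosen for this illustration.
    """
    predicted = set()
    for phenotype, functions in annotated.values():
        if cosine_similarity(query, phenotype) >= threshold:
            predicted.update(functions)
    return predicted


# Hypothetical phenotype vectors and function labels.
annotated = {
    "gene_a": ([1.0, 0.0, 0.2], {"adhesion"}),
    "gene_b": ([0.9, 0.1, 0.3], {"adhesion", "migration"}),
    "gene_c": ([0.0, 1.0, 0.0], {"proliferation"}),
}
predicted = predict_functions([1.0, 0.05, 0.25], annotated)
print(sorted(predicted))
```

Because matches are pooled across all similar genes, a single query can receive several functions, which mirrors the multi-function prediction the abstract describes.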

In silico Prediction of Inhibitory Constant of Thrombin Inhibitors Using Machine Learning

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666181220130232 ◽

2019 ◽

Vol 21 (9) ◽

pp. 662-669 ◽

Cited By ~ 1

Author(s):

Junnan Zhao ◽

Lu Zhu ◽

Weineng Zhou ◽

Lingfeng Yin ◽

Yuchen Wang ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Regression Tree ◽

Large Data ◽

Thrombin Inhibitors ◽

Coagulation Cascade ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Descriptor Selection

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because it can find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected, and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, namely multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best among these methods, with R² = 0.84 and MSE = 0.55 for the training set and R² = 0.83 and MSE = 0.56 for the test set. Several validation methods, such as the Y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
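
The reported metrics and the Y-randomization check can be sketched as follows. This Python example is only an illustration under stated assumptions: it uses synthetic data and a one-descriptor least-squares line in place of the study's descriptors and SVM model, and it scores on the training data only. It computes R² and MSE, then refits after shuffling the responses to show that R² collapses when the link between descriptor and response is destroyed.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(ys, preds):
    """Mean squared error."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def r_squared(ys, preds):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def score(xs, ys):
    """Fit on (xs, ys) and return (R^2, MSE) on the same data."""
    slope, intercept = fit_line(xs, ys)
    preds = [slope * x + intercept for x in xs]
    return r_squared(ys, preds), mse(ys, preds)


# Synthetic stand-in data: one descriptor, linear response plus noise.
rng = random.Random(0)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

true_r2, true_mse = score(xs, ys)

# Y-randomization: shuffle the responses, refit, and record R^2 each time.
shuffled_r2 = []
for _ in range(20):
    permuted = ys[:]
    rng.shuffle(permuted)
    shuffled_r2.append(score(xs, permuted)[0])
mean_shuffled_r2 = sum(shuffled_r2) / len(shuffled_r2)

print(f"R2={true_r2:.3f} MSE={true_mse:.3f} shuffled R2={mean_shuffled_r2:.3f}")
```

A model whose score stays high on shuffled responses would be fitting chance correlations, which is what the test is meant to rule out.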

Hexagonal Image Processing in the Context of Machine Learning: Conception of a Biologically Inspired Hexagonal Deep Learning Framework

2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) ◽

10.1109/icmla.2019.00300 ◽

2019 ◽

Cited By ~ 1

Author(s):

Tobias Schlosser ◽

Michael Friedrich ◽

Danny Kowerko

Keyword(s):

Machine Learning ◽

Image Processing ◽

Deep Learning ◽

Biologically Inspired ◽

Learning Framework ◽

Learning Conception ◽

Hexagonal Image Processing

Non-Intrusive Parametric Model Order Reduction with Error Correction Modeling for Changing Well Locations Using a Machine Learning Framework

10.2118/199042-ms ◽

2020 ◽

Author(s):

Hardikkumar Zalavadia ◽

Eduardo Gildin

Keyword(s):

Machine Learning ◽

Error Correction ◽

Model Order Reduction ◽

Order Reduction ◽

Parametric Model ◽

Model Order ◽

Parametric Model Order Reduction ◽

Learning Framework ◽

Error Correction Modeling

SCOUR: a stepwise machine learning framework for predicting metabolite-dependent regulatory interactions

BMC Bioinformatics ◽

10.1186/s12859-021-04281-7 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Justin Y. Lee ◽

Britney Nguyen ◽

Carlos Orosco ◽

Mark P. Styczynski

Keyword(s):

Machine Learning ◽

Metabolic Networks ◽

Sampling Frequency ◽

Low Noise ◽

Training Data ◽

High Noise ◽

Regulatory Interactions ◽

Learning Framework ◽

Metabolic Systems ◽

Noise Data

Background: The topology of metabolic networks is both well-studied and remarkably well-conserved across many species. The regulation of these networks, however, is much more poorly characterized, though it is known to be divergent across organisms; these two characteristics make it difficult to model metabolic networks accurately. While many computational methods have been built to unravel transcriptional regulation, few approaches have been developed for systems-scale analysis and study of metabolic regulation. Here, we present a stepwise machine learning framework that applies established algorithms to identify regulatory interactions in metabolic systems based on metabolic data: stepwise classification of unknown regulation, or SCOUR. Results: We evaluated our framework on both noiseless and noisy data, using several models of varying sizes and topologies to show that our approach is generalizable. We found that, when testing on data under the most realistic conditions (low sampling frequency and high noise), SCOUR could identify reaction fluxes controlled only by the concentration of a single metabolite (its primary substrate) with high accuracy. The positive predictive value (PPV) for identifying reactions controlled by the concentration of two metabolites ranged from 32 to 88% for noiseless data, 9.2 to 49% for either low sampling frequency/low noise or high sampling frequency/high noise data, and 6.6 to 27% for low sampling frequency/high noise data, with results typically sufficiently high for lab validation to be a practical endeavor. While the PPVs for reactions controlled by three metabolites were lower, they were still in most cases significantly better than random classification. Conclusions: SCOUR uses a novel approach to synthetically generate the training data needed to identify regulators of reaction fluxes in a given metabolic system, enabling metabolomics and fluxomics data to be leveraged for regulatory structure inference. By identifying and triaging the most likely candidate regulatory interactions, SCOUR can drastically reduce the amount of time needed to identify and experimentally validate metabolic regulatory interactions. As high-throughput experimental methods for testing these interactions are further developed, SCOUR will have a critical impact on the development of predictive metabolic models in new organisms and pathways.
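
For readers unfamiliar with the metric, positive predictive value is the fraction of predicted positives that are correct, TP / (TP + FP). The short Python sketch below computes it for a hypothetical set of predicted regulatory interactions; the (metabolite, reaction) pairs are made up for illustration and are not taken from the study.

```python
def positive_predictive_value(predicted, actual):
    """PPV = true positives / (true positives + false positives)."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    true_positives = len(predicted & set(actual))
    return true_positives / len(predicted)


# Hypothetical (metabolite, reaction) pairs: predictions versus ground truth.
predicted = {("atp", "r1"), ("nadh", "r2"), ("g6p", "r3"), ("pyr", "r4")}
actual = {("atp", "r1"), ("g6p", "r3"), ("cit", "r5")}
ppv = positive_predictive_value(predicted, actual)
print(f"PPV = {ppv:.0%}")
```

PPV says nothing about true interactions that were never predicted (here, the `cit` pair), so it measures how much validation effort would be wasted rather than how complete the predictions are.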

An Efficient Machine Learning Framework for Stress Prediction via Sensor Integrated Keyboard Data

IEEE Access ◽

10.1109/access.2021.3094334 ◽

2021 ◽

pp. 1-1

Author(s):

P.B. Pankajavalli ◽

G.S. Karthick ◽

R. Sakthivel

Keyword(s):

Machine Learning ◽

Learning Framework ◽

Stress Prediction ◽

Efficient Machine

A digital-twin and machine-learning framework for the design of multiobjective agrophotovoltaic solar farms

Computational Mechanics ◽

10.1007/s00466-021-02035-z ◽

2021 ◽

Author(s):

T. I. Zohdi

Keyword(s):

Machine Learning ◽

Digital Twin ◽

Learning Framework

Simulation of sports movement training based on machine learning and brain-computer interface

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189481 ◽

2020 ◽

pp. 1-12

Author(s):

Linuo Wang

Keyword(s):

Machine Learning ◽

Time Series ◽

Joint Learning ◽

Scientific Methods ◽

Learning Framework ◽

Brain Functions ◽

Movement Training ◽

Practical Effect ◽

Machine Interface ◽

The Brain

Injuries and hidden dangers in training have a great impact on athletes' careers. In particular, the brain function that controls the motor function area has a great impact on an athlete's competitive ability. It is therefore necessary to adopt scientific methods to recognize brain functions. In this paper, we study the structure of the motor brain-computer interface and improve it based on traditional methods. Moreover, supported by machine learning and SVM technology, this study uses a DSP filter to convert the preprocessed EEG signal X into a time series, and adjusts the distance between the time series to classify the data. To resolve the inconsistency of DSP algorithms, a multi-layer joint learning framework based on a logistic regression model is proposed, and a sports brain-machine interface system based on machine learning and SVM is constructed. In addition, this study designed a control experiment to improve the performance of the proposed method. The results show that the method has a certain practical effect and can be applied to sports.
