Prediction of vascular aging based on smartphone acquired PPG signals

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Lorenzo Dall’Olio ◽  
Nico Curti ◽  
Daniel Remondini ◽  
Yosef Safi Harb ◽  
Folkert W. Asselbergs ◽  
...  

Abstract
Photoplethysmography (PPG) measured by smartphone has the potential to serve as a large-scale, non-invasive, and easy-to-use screening tool. Vascular aging is linked to increased arterial stiffness, which can be measured by PPG. We investigate the feasibility of using PPG to predict healthy vascular aging (HVA) based on two approaches: machine learning (ML) and deep learning (DL). We performed data preprocessing, including detrending, demodulating, and denoising, on the raw PPG signals. For ML, ridge penalized regression was applied to 38 features extracted from PPG, whereas for DL several convolutional neural networks (CNNs) were applied to the whole PPG signal as input. The analysis was conducted using the crowd-sourced Heart for Heart data. The prediction performance of ML using two features (AUC of 94.7%) – the a wave of the second-derivative PPG and tpr – together with four covariates (sex, height, weight, and smoking) was similar to that of the best-performing CNN, a 12-layer ResNet (AUC of 95.3%). Without the heavy computational cost of DL, ML may be advantageous for finding potential biomarkers for HVA prediction. The whole workflow of the procedure is clearly described, and open software has been made available to facilitate replication of the results.
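The ridge-penalized feature model described in this abstract can be sketched with synthetic data. This is a minimal NumPy illustration of ridge regression on hand-crafted features plus covariates; the feature names, data, and regularization strength below are illustrative assumptions, not the study's actual data or settings.

```python
import numpy as np

# Illustrative sketch: ridge-penalized linear model on extracted PPG features.
# Columns stand in for: a-wave amplitude, tpr, sex, height, weight, smoking.
rng = np.random.default_rng(0)

n = 200
X = rng.normal(size=(n, 6))
true_w = np.array([1.5, -1.0, 0.3, 0.2, -0.2, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=n)   # continuous proxy outcome

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_hat = ridge_fit(X, y, lam=1.0)
```

The closed form exists because the penalized least-squares objective is quadratic; the penalty `lam` shrinks coefficients toward zero, which is what makes ridge useful for ranking candidate biomarkers without overfitting.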


2021 ◽  
Author(s):  
Kenneth Atz ◽  
Clemens Isert ◽  
Markus N. A. Böcker ◽  
José Jiménez-Luna ◽  
Gisbert Schneider

Many molecular design tasks benefit from fast and accurate calculations of quantum-mechanical (QM) properties. However, the computational cost of QM methods applied to drug-like molecules currently renders large-scale applications of quantum chemistry challenging. Aiming to mitigate this problem, we developed DelFTa, an open-source toolbox for the prediction of electronic properties of drug-like molecules at the density functional theory (DFT) level, using Δ-machine learning. Δ-learning corrects the prediction error (Δ) of a fast but inaccurate property calculation. DelFTa employs state-of-the-art three-dimensional message-passing neural networks trained on a large dataset of QM properties. It provides access to a wide array of quantum observables on the molecular, atomic, and bond levels by predicting approximations to DFT values from a low-cost semiempirical baseline. Δ-learning outperformed its direct-learning counterpart for most of the considered QM endpoints. The results suggest that predictions for non-covalent intra- and intermolecular interactions can be extrapolated to larger biomolecular systems. The software is fully open-sourced and features documented command-line and Python APIs.


2021 ◽  
Author(s):  
Kenneth Atz ◽  
Clemens Isert ◽  
Markus N. A. Böcker ◽  
José Jiménez-Luna ◽  
Gisbert Schneider

Certain molecular design tasks benefit from fast and accurate calculations of quantum-mechanical (QM) properties. However, the computational cost of QM methods applied to drug-like compounds currently makes large-scale applications of quantum chemistry challenging. To mitigate this problem, we developed DelFTa, an open-source toolbox for predicting small-molecule electronic properties at the density functional theory (DFT) level, using the Δ-machine-learning principle. DelFTa employs state-of-the-art E(3)-equivariant graph neural networks trained on the QMugs dataset of QM properties. It provides access to a wide array of quantum observables by predicting approximations to ωB97X-D/def2-SVP values from a GFN2-xTB semiempirical baseline. Δ-learning with DelFTa was shown to outperform direct DFT learning for most of the considered QM endpoints. The software is provided as open-source code with fully documented command-line and Python APIs.
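The Δ-learning principle underlying DelFTa can be illustrated with a toy example: rather than learning the expensive target property directly, a model learns the correction from a cheap baseline to the target. The linear models and synthetic "baseline"/"target" values below are purely illustrative stand-ins for GFN2-xTB and ωB97X-D/def2-SVP calculations.

```python
import numpy as np

# Toy Δ-learning sketch: learn the residual between a cheap baseline
# calculation and an expensive target, then correct the baseline.
rng = np.random.default_rng(1)

n, d = 500, 4
X = rng.normal(size=(n, d))                      # molecular descriptors (illustrative)
target = X @ np.array([2.0, -1.0, 0.5, 0.3])     # "high-level" property values
baseline = target + 0.3 * X[:, 0] + 0.1          # cheap method with systematic error

# Fit a linear model to the residual Δ = target - baseline.
delta = target - baseline
A = np.c_[X, np.ones(n)]                         # descriptors plus intercept
w, *_ = np.linalg.lstsq(A, delta, rcond=None)

# Δ-corrected prediction = baseline + learned correction.
pred = baseline + A @ w
rmse_baseline = np.sqrt(np.mean((baseline - target) ** 2))
rmse_delta = np.sqrt(np.mean((pred - target) ** 2))
```

The point of the decomposition is that the residual Δ is usually a smoother, easier-to-learn function than the target property itself, so a modest model on Δ can beat a large model trained directly on the target.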


2021 ◽  
Vol 14 (13) ◽  
p. 3420
Author(s):  
Matei Zaharia

Building production ML applications is difficult because of their resource cost and complex failure modes. I will discuss these challenges from two perspectives: the Stanford DAWN Lab and experience with large-scale commercial ML users at Databricks. I will then present two emerging ideas to help address these challenges. The first is "ML platforms", an emerging class of software systems that standardize the interfaces used in ML applications to make them easier to build and maintain. I will give a few examples, including the open-source MLflow system from Databricks [3]. The second idea is models that are more "production-friendly" by design. As a concrete example, I will discuss retrieval-based NLP models such as Stanford's ColBERT [1, 2] that query documents from an updateable corpus to perform tasks such as question answering. This design offers several practical advantages, including low computational cost, high interpretability, and very fast updates to the model's "knowledge". These models are an exciting alternative to large language models such as GPT-3.


2020 ◽  
Vol 73 ◽  
pp. S72
Author(s):  
Miquel Serra-Burriel ◽  
Isabel Graupera ◽  
Maja Thiele ◽  
Llorenç Caballeria ◽  
Dominique Roulot ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Taewon Jin ◽  
Ina Park ◽  
Taesu Park ◽  
Jaesik Park ◽  
Ji Hoon Shim

Abstract
Properties of solid-state materials depend on their crystal structures. In solid-solution high-entropy alloys (HEAs), mechanical properties such as strength and ductility depend on the phase. Therefore, crystal structure prediction is a prerequisite for discovering new functional materials. Recently, machine-learning-based approaches have been successfully applied to the prediction of structural phases. However, since machine learning typically uses about 80% of a dataset for training, preparing a training dataset for multi-element alloys is notoriously expensive. In this work, we develop an efficient approach to predicting the structural phases of multi-element alloys without preparing a large training dataset. We demonstrate that our method, trained on a binary-alloy dataset, can be applied to crystal structure prediction for multi-element alloys by designing a transformation module from raw features to an expandable form. Surprisingly, without involving multi-element alloys in the training process, we obtain an accuracy of 80.56% for the phases of multi-element alloys and 84.20% for the phases of HEAs, comparable with previous machine learning results. Moreover, our approach saves at least three orders of magnitude in computational cost for HEAs by employing expandable features. We suggest that this accelerated approach can be applied to predicting various structural properties of multi-element alloys that do not exist in current structural databases.
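The idea of an "expandable" feature form can be illustrated as follows: combine elemental descriptors with composition weights so the feature dimension is fixed regardless of how many elements an alloy contains, which is what lets a binary-trained model score multi-element alloys. The toy descriptor table and the weighted-average scheme below are illustrative assumptions, not the paper's actual transformation module.

```python
import numpy as np

# Toy elemental descriptors: (atomic radius in Å, Pauling electronegativity).
ELEM_FEATS = {
    "Fe": (1.26, 1.83),
    "Co": (1.25, 1.88),
    "Ni": (1.24, 1.91),
    "Cr": (1.28, 1.66),
    "Mn": (1.27, 1.55),
}

def expandable_features(composition):
    """Composition-weighted mean and spread of elemental descriptors.

    Output dimension is fixed no matter how many elements the alloy has,
    so binary alloys and HEAs map into the same feature space.
    """
    elems, fracs = zip(*composition.items())
    fracs = np.array(fracs, dtype=float)
    fracs /= fracs.sum()                       # normalize atomic fractions
    feats = np.array([ELEM_FEATS[e] for e in elems])
    mean = fracs @ feats
    spread = fracs @ (feats - mean) ** 2       # composition-weighted variance
    return np.concatenate([mean, spread])

# A binary alloy and a 5-element HEA map to the same feature dimension.
x_binary = expandable_features({"Fe": 0.5, "Ni": 0.5})
x_hea = expandable_features({"Fe": 1, "Co": 1, "Ni": 1, "Cr": 1, "Mn": 1})
```

Because both inputs land in the same fixed-dimensional space, any model fitted on binary-alloy features can be evaluated on the HEA features directly, which is the transfer step the abstract describes.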


2018 ◽  
Author(s):  
Adrian Jinich ◽  
Benjamin Sanchez-Lengeling ◽  
Haniu Ren ◽  
Rebecca Harman ◽  
Alán Aspuru-Guzik

Abstract
A quantitative understanding of the thermodynamics of biochemical reactions is essential for accurately modeling metabolism. The group contribution method (GCM) is one of the most widely used approaches to estimating standard Gibbs energies and redox potentials of reactions for which no experimental measurements exist. Previous work has shown that quantum chemical predictions of biochemical thermodynamics are a promising approach to overcoming the limitations of GCM. However, the quantum chemistry approach is significantly more expensive. Here we use a combination of quantum chemistry and machine learning to obtain a fast and accurate method for predicting the thermodynamics of biochemical redox reactions. We focus on predicting the redox potentials of carbonyl functional group reductions to alcohols and amines, two of the most ubiquitous carbon redox transformations in biology. Our method relies on semi-empirical quantum chemistry calculations calibrated with Gaussian process (GP) regression against available experimental data, and it achieves higher predictive power than the GCM at low computational cost. We design and implement a network expansion algorithm that iteratively reduces and oxidizes a set of natural seed metabolites, and demonstrate the high-throughput applicability of our method by predicting the standard potentials of more than 315,000 redox reactions involving approximately 70,000 compounds. Additionally, we develop a novel fingerprint-based framework for detecting molecular environment motifs that are enriched or depleted across different regions of the redox potential landscape. We provide open access to all source code and data generated.
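The calibration step described above, GP regression mapping cheap semi-empirical estimates onto experimental values, can be sketched in a few lines. The synthetic data, RBF kernel, and hyperparameters below are illustrative assumptions, not the study's actual calibration model.

```python
import numpy as np

# Toy GP-regression calibration: learn the mapping from semi-empirical
# predictions (x) to "experimental" reference values (y), then use the
# posterior mean to correct new cheap predictions.
def rbf(a, b, length=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

# Semi-empirical estimates with a smooth systematic deviation from experiment.
x_train = np.linspace(-2.0, 2.0, 25)
y_train = x_train + 0.5 * np.sin(2.0 * x_train)
noise = 1e-4                                   # jitter / observation noise

K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
alpha = np.linalg.solve(K, y_train)            # precomputed weights K^(-1) y

def gp_calibrate(x_new):
    """GP posterior mean: calibrated value for a new semi-empirical estimate."""
    return rbf(np.atleast_1d(float(x_new)), x_train) @ alpha

y_hat = gp_calibrate(0.5)
```

The posterior mean interpolates the training data, so within the calibrated range the correction tracks the systematic deviation closely; outside it, the GP reverts toward the prior, which is a useful built-in warning against extrapolation.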


Author(s):  
Zhijian Luo ◽  
Yuntao Qian

Stochastic optimization for large-scale machine learning problems has developed rapidly since stochastic gradient methods with variance reduction techniques were introduced. Several stochastic second-order methods, which approximate curvature information via the Hessian in a stochastic setting, have been proposed as improvements. In this paper, we introduce a Stochastic Sub-Sampled Newton method with Variance Reduction (S2NMVR), which combines the sub-sampled Newton method with stochastic variance-reduced gradients. For many machine learning problems, the linear-time Hessian-vector product underpins the computational efficiency of S2NMVR. We then develop two variants of S2NMVR that preserve the estimate of the Hessian inverse and reduce the computational cost of Hessian-vector products for nonlinear problems.

