Identifying homogeneous subgroups of patients and important features: a topological machine learning approach

Background: This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph. Results: We present a pipeline to identify and summarise clusters based on statistically significant topological features from a point cloud using Mapper. Conclusions: Key strengths of this pipeline include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types. Our pipeline can be downloaded under the GNU GPLv3 license at https://github.com/kcl-bhi/mapper-pipeline.
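As a rough illustration of the Mapper idea mentioned in this abstract, the following minimal sketch builds a Mapper-style graph from a small point cloud. It is not the authors' pipeline (which adds bootstrapping, prior knowledge and mixed data types); the function names `mapper_graph` and `single_linkage`, the interval cover, and the distance-threshold clustering are all assumptions chosen for brevity.

```python
import math
from itertools import combinations


def single_linkage(points, idx, eps):
    """Split the indices in idx into groups linked by distances <= eps."""
    remaining = set(idx)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, stack = {seed}, [seed]
        while stack:
            cur = stack.pop()
            near = {j for j in remaining
                    if math.dist(points[cur], points[j]) <= eps}
            remaining -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=3, overlap=0.3, eps=0.6):
    """Toy Mapper: cover the filter range, cluster each preimage, link overlaps."""
    values = [filter_fn(p) for p in points]
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals or 1.0
    pad = overlap * length  # how far each interval extends into its neighbours
    nodes = []
    for i in range(n_intervals):
        a = lo + i * length - pad
        b = lo + (i + 1) * length + pad
        idx = [j for j, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, idx, eps))
    # Two nodes are joined when their clusters share at least one point.
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Hypothetical example data: 12 points on the unit circle, filtered by x.
circle = [(math.cos(math.radians(30 * k)), math.sin(math.radians(30 * k)))
          for k in range(12)]
nodes, edges = mapper_graph(circle, filter_fn=lambda p: p[0])
print(len(nodes), "nodes,", len(edges), "edges")
```

With these settings the circle is summarised by four nodes joined in a single loop, which is the one-dimensional graph Mapper is meant to recover for such data.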


Deep Learning in Disease Diagnosis: Models and Datasets

Current Bioinformatics ◽

10.2174/1574893615999201002124021 ◽

2020 ◽

Vol 15 ◽

Author(s):

Deeksha Saxena ◽

Mohammed Haris Siddiqui ◽

Rajnish Kumar

Keyword(s):

Biological Sciences ◽

Machine Learning ◽

Deep Learning ◽

Disease Diagnosis ◽

Learning Models ◽

Data Types ◽

Related Data ◽

Abstract Level ◽

Experimental Validations ◽

Selection Of

Background: Deep learning (DL) is an artificial neural network-driven framework with multiple levels of representation, in which non-linear modules are combined so that the representation is raised from a lower level to a much more abstract one. Although DL is used in almost every field, it has brought a particular breakthrough in the biological sciences, where it is used in disease diagnosis and clinical trials. DL can be combined with machine learning, but the two are also used individually. DL appears to be a better platform than machine learning because it does not require an intermediate feature-extraction step and works well with larger datasets. DL is currently one of the most discussed fields among scientists and researchers for diagnosing and solving various biological problems. However, deep learning models need some improvement and experimental validation to be more productive. Objective: To review the available DL models and datasets that are used in disease diagnosis. Methods: Available DL models and their applications in disease diagnosis were reviewed, discussed and tabulated. Types of datasets and some of the popular disease-related data sources for DL were highlighted. Results: We analyzed the frequently used DL methods and data types, and discussed some of the recent deep learning models used for solving different biological problems. Conclusion: The review presents useful insights into DL methods, data types, and the selection of DL models for disease diagnosis.


Empowering Advanced Driver-Assistance Systems from Topological Data Analysis

Mathematics ◽

10.3390/math9060634 ◽

2021 ◽

Vol 9 (6) ◽

pp. 634

Author(s):

Tarek Frahi ◽

Francisco Chinesta ◽

Antonio Falcó ◽

Alberto Badias ◽

Elias Cueto ◽

...

Keyword(s):

Data Analysis ◽

The State ◽

Sensor Data ◽

Topological Data Analysis ◽

Motion Sensors ◽

Driver Assistance Systems ◽

The Road ◽

Topological Features ◽

Recent Developments ◽

Topological Data

We are interested in evaluating the state of drivers, that is, in determining whether they are attentive to the road, using motion sensor data collected from car driving experiments. Our goal is to design a predictive model that can estimate the state of drivers given the data collected from motion sensors. For that purpose, we leverage recent developments in topological data analysis (TDA) to analyze and transform the data coming from sensor time series and build a machine learning model based on the topological features extracted with TDA. We provide experiments showing that our model accurately identifies the state of the user, predicting whether they are relaxed or tense.


Machine learning approach for the search of resonances with topological features at the Large Hadron Collider

International Journal of Modern Physics A ◽

10.1142/s0217751x21502419 ◽

2021 ◽

Author(s):

Salah-Eddine Dahbi ◽

Joshua Choma ◽

Gaogalalwe Mokgatitswane ◽

Xifeng Ruan ◽

Benjamin Lieberman ◽

...

Keyword(s):

Machine Learning ◽

Large Hadron Collider ◽

Hadron Collider ◽

Learning Approach ◽

Topological Features ◽

Machine Learning Approach


Continuous Selection of Optimized Traffic Light Schedules: A Machine Learning Approach

2018 21st International Conference on Intelligent Transportation Systems (ITSC) ◽

10.1109/itsc.2018.8569563 ◽

2018 ◽

Author(s):

Shumeet Baluja

Keyword(s):

Machine Learning ◽

Continuous Selection ◽

Learning Approach ◽

Traffic Light ◽

Machine Learning Approach ◽

Selection Of


Machine learning approach for analyzing complex data from atomic force microscopes

Scilight ◽

10.1063/1.5114991 ◽

2019 ◽

Vol 2019 (25) ◽

pp. 250002

Author(s):

Adam Liebendorfer

Keyword(s):

Machine Learning ◽

Learning Approach ◽

Complex Data ◽

Atomic Force Microscopes ◽

Atomic Force ◽

Machine Learning Approach


A Machine Learning Approach for Efficient Selection of Enzyme Concentrations and Its Application for Flux Optimization

Catalysts ◽

10.3390/catal10030291 ◽

2020 ◽

Vol 10 (3) ◽

pp. 291 ◽

Cited By ~ 1

Author(s):

Anamya Ajjolli Nagaraja ◽

Philippe Charton ◽

Xavier F. Cadet ◽

Nicolas Fontaine ◽

Mathieu Delsaut ◽

...

Keyword(s):

Machine Learning ◽

Glass Ceiling ◽

Principal Component ◽

Enzyme Concentration ◽

Learning Approach ◽

Neural Network Approach ◽

Free System ◽

Machine Learning Approach ◽

Selection Of

The metabolic engineering of pathways has been used extensively to produce molecules of interest on an industrial scale. Methods like gene regulation or substrate channeling have helped to improve the desired product yield. Cell-free systems are used to overcome the weaknesses of engineered strains. One of the challenges in a cell-free system is selecting the enzyme concentrations that give the optimal yield. Here, a machine learning approach is used to select the enzyme concentrations for the upper part of glycolysis. The artificial neural network (ANN) approach is known to be inefficient at extrapolating predictions outside the box: high predicted values will bump into a sort of “glass ceiling”. In order to explore this “glass ceiling” space, we developed a new methodology named glass ceiling ANN (GC-ANN). Principal component analysis (PCA) and data classification methods are used to derive a rule for a high flux, and an ANN is used to predict the flux through the pathway from input data on 121 balances of four enzymes in the upper part of glycolysis. The outcomes of this study are (i) in silico selection of optimum enzyme concentrations for a maximum flux through the pathway and (ii) experimental in vitro validation of the “out-of-the-box” fluxes predicted using this new approach. Surprisingly, flux improvements of up to 63% were obtained. Gratifyingly, these improvements are coupled with a cost decrease of up to 25% for the assay.


The Prediction of Essential Medicines Demand: A Machine Learning Approach Using Consumption Data in Rwanda

Processes ◽

10.3390/pr10010026 ◽

2021 ◽

Vol 10 (1) ◽

pp. 26

Author(s):

Francois Mbonyinshuti ◽

Joseph Nkurunziza ◽

Japhet Niyobuhungiro ◽

Egide Kayitare

Keyword(s):

Machine Learning ◽

Random Forest ◽

Predictive Modeling ◽

Essential Medicines ◽

Complex Data ◽

Global Business ◽

Oral Rehydration ◽

Consumption Data ◽

Machine Learning Approach ◽

And Performance

Today’s global business trends are causing a significant and complex data revolution in the healthcare industry, culminating in the use of artificial intelligence and predictive modeling to improve health outcomes and performance. The dataset, based on consumption data from 2015 to 2019, included approximately 500 goods. After a series of data pre-processing activities, the top ten (10) most used essential medicines were chosen, namely cotrimoxazole 480 mg, amoxicillin 250 mg, paracetamol 500 mg, oral rehydration salts (O.R.S) sachet 20.5 g, chlorpheniramine 4 mg, nevirapine 200 mg, aminophylline 100 mg, artemether 20 mg + lumefantrine (AL) 120 mg, Cromoglycate ophthalmic. Our study concentrated on the application of machine learning (ML) to forecast future trends in the demand for essential drugs in Rwanda. The following models were created and applied: linear regression, artificial neural network, and random forest. The random forest predicted demand for the 10 selected medicines with an accuracy of 88 percent on the training set and 76 percent on the test set, and it can thus be used to forecast future demand from past consumption data by inputting a month, year, district, and medicine name. According to our findings, the random forest model performed well as a forecasting model for the demand for essential medicines. Finally, data-driven predictive modeling with machine learning (ML) could become the cornerstone of health supply chain planning and operational management.


Cubical homology-based Image Classification - A Comparative Study

10.36939/ir.202112231202 ◽

2021 ◽

Author(s):

Seungho Choe

Keyword(s):

Machine Learning ◽

Image Classification ◽

Digital Image ◽

Persistent Homology ◽

Topological Data Analysis ◽

Connected Components ◽

Gradient Boosting ◽

Topological Features ◽

Light Gradient ◽

Cubical Homology

Persistent homology is a powerful tool in topological data analysis (TDA) for efficiently computing, studying and encoding multi-scale topological features, and it is increasingly used in digital image classification. The topological features represent the number of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image and uses a collection of cubes to compute the homology, which fits the grid structure of digital images. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a score that measures the significance of each of the sub-simplices in terms of persistence. Gray level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are also used as supplementary methods for extracting features. Machine learning techniques are then employed to classify images using the topological signatures. Among the eight algorithms tested on six published image datasets with varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with a deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.
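To make the "birth and death through a filtration" idea concrete, here is a minimal sketch that computes only the 0-dimensional (connected-component) persistence pairs of a small grayscale image under a sublevel-set filtration. It is not the algorithm proposed in the abstract: the function name `sublevel_persistence_0d`, the treatment of pixels as vertices with 4-connectivity, and the example image are all assumptions made for illustration.

```python
import math


def sublevel_persistence_0d(image):
    """Return sorted (birth, death) pairs of connected components.

    Pixels enter in order of increasing gray value. When two components
    meet, the one born later dies (the elder rule). Pairs with zero
    lifespan are dropped, and the oldest component never dies.
    """
    rows, cols = len(image), len(image[0])
    order = sorted((image[r][c], r, c)
                   for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already added
    birth = {}   # birth value of each component, stored at its root

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        cell = (r, c)
        parent[cell] = cell
        birth[cell] = value
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nb not in parent:
                continue  # neighbour is outside the grid or not yet added
            a, b = find(cell), find(nb)
            if a == b:
                continue
            if birth[a] < birth[b]:
                a, b = b, a  # make a the younger component
            if birth[a] < value:
                pairs.append((birth[a], value))
            parent[a] = b  # the older root survives
    pairs.append((order[0][0], math.inf))
    return sorted(pairs)


# Hypothetical 3x3 image with four local minima (values 0, 1, 2 and 3).
image = [
    [1, 6, 2],
    [7, 9, 5],
    [3, 8, 0],
]
diagram = sublevel_persistence_0d(image)
print(diagram)
```

Each pair is one point of the persistence diagram: the component born at gray level 2, for example, merges into the older component born at 0 once the pixel with value 5 is added.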


Predictive Model for Selection of Upper Treated Vertebra Using a Machine Learning Approach

World Neurosurgery ◽

10.1016/j.wneu.2020.10.073 ◽

2020 ◽

Author(s):

Renaud Lafage ◽

Bryan Ang ◽

Basel Sheikh Alshabab ◽

Jonathan Elysee ◽

Francis Lovecchio ◽

...

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Learning Approach ◽

Machine Learning Approach ◽

Selection Of


Predicting the chemical reactivity of organic materials using a machine-learning approach

Chemical Science ◽

10.1039/d0sc01328e ◽

2020 ◽

Vol 11 (30) ◽

pp. 7813-7822 ◽

Cited By ~ 1

Author(s):

Byungju Lee ◽

Jaekyun Yoo ◽

Kisuk Kang

Keyword(s):

Machine Learning ◽

Organic Materials ◽

Chemical Reactivity ◽

Functional Materials ◽

Chemical Components ◽

Learning Approach ◽

System P ◽

Machine Learning Approach ◽

Selection Of

Stability and compatibility between chemical components are essential parameters that need to be considered in the selection of functional materials in configuring a system.
