AMLBID: An auto-explained Automated Machine Learning tool for Big Industrial Data

SoftwareX ◽  
2022 ◽  
Vol 17 ◽  
pp. 100919
Author(s):  
Moncef Garouani ◽  
Adeel Ahmad ◽  
Mourad Bouneffa ◽  
Mohamed Hamlich
Lab on a Chip ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 2166-2174
Author(s):  
Hanfei Shen ◽  
Tony Liu ◽  
Jesse Cui ◽  
Piyush Borole ◽  
Ari Benjamin ◽  
...  

We have developed a web-based, self-improving and overfitting-resistant automated machine learning tool tailored specifically for liquid biopsy data, where machine learning models can be built without the user's input.


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e13555-e13555
Author(s):  
Wei Zhou ◽  
Ji He

e13555 Background: Survival analysis is used to establish a connection between covariates and the time to an event in censored data. Compared with traditional statistical methods, machine learning approaches based on sophisticated and effective computational algorithms are more capable of handling complex multi-dimensional medical data. Methods: We developed an automated machine learning tool, MLsurvival, to analyze survival data of cancer patients. Its algorithms include statistical Cox regression and machine learning methods based on a linear model (elastic net), ensemble models (gradient boosting with least squares or regression trees, and random forest) and support vector kernels (linear and non-linear). The MLsurvival workflow comprises four modules: preprocessing (removal or imputation of missing data and feature standardization), feature selection (unsupervised multi-statistics and supervised machine learning-based recursive feature elimination with cross-validation), modeling (hyperparameter tuning and performance evaluation) and prediction. To evaluate the performance of this tool, we analyzed medical data for 222 hepatocellular carcinoma (HCC) patients at stage II-III who underwent surgical resection and developed five machine learning-based estimation models for overall survival (OS). Models were trained on 155 patients with 300 features, including clinical information, somatic mutation and copy number variation, and independently validated on the remaining 67 patients. Results: The gradient boosting ensemble model fitted by MLsurvival using 48 selected features on the data of 155 HCC patients achieved the highest mean AUC and C-index. For the 67 patients in the validation set, this model predicted half-year mortality with an AUC of 0.9 (95% CI, 0.771-1.029) and one-year mortality with an AUC of 0.897 (95% CI, 0.816-0.978). In addition, this model was also predictive of the time of recurrence (p-value < 0.0001). Furthermore, we applied this tool to survival analysis of extensive real data from patients with breast, lung, and esophagus cancers, and most results showed superior accuracy and stable performance. Conclusions: MLsurvival is an automated tool for survival analysis of cancer patients with good performance. The risk scoring system implemented in this tool offers a novel strategy for incorporating multi-dimensional risk factors to predict clinical outcome, contributes to a better understanding of disease background and helps to optimize the clinical follow-up and therapeutic treatment of cancer patients.
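The abstract reports a C-index without defining it. As a minimal sketch, and not the MLsurvival implementation, Harrell's concordance index for right-censored data can be computed as below. The function name, the argument layout, and the decision to skip tied observation times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output; higher means higher predicted risk
    (Hypothetical interface, chosen for this sketch only.)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # assumption: tied times are skipped in this sketch
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event has the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.2]))  # 1.0
print(concordance_index(times, events, [0.9, 0.2, 0.4, 0.7]))  # 0.6
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.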


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Scott Broderick ◽  
Ruhil Dongol ◽  
Tianmu Zhang ◽  
Krishna Rajan

Abstract: This paper introduces the use of topological data analysis (TDA) as an unsupervised machine learning tool to uncover classification criteria in complex inorganic crystal chemistries. Using the apatite chemistry as a template, we use persistent homology to track how the topological connectivity of input crystal chemistry descriptors defines similarity between different stoichiometries of apatites. It is shown that TDA automatically identifies a hierarchical classification scheme within apatites based on the commonality of the number of discrete coordination polyhedra that constitute the structural building units common among the compounds. This information is presented in the form of a visualization scheme of a barcode of homology classifications, where the persistence of similarity between compounds is tracked. Unlike traditional perspectives of structure maps, this new “Materials Barcode” schema serves as an automated exploratory machine learning tool that can uncover structural associations from crystal chemistry databases and provide a more nuanced insight into what defines similarity among homologous compounds.
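To make the barcode idea concrete, the sketch below computes only the dimension-0 persistence bars (connected components) of a point cloud. It is an illustration under stated assumptions, not the authors' pipeline: it assumes descriptor vectors compared by Euclidean distance in a Vietoris-Rips style filtration, and the function name `h0_barcode` is hypothetical.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return the 0-dimensional persistence bars as (birth, death) pairs.

    Every point is born at scale 0. A component dies at the distance at
    which it merges into another one; the last component never dies.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise distances, processed from the shortest to the longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj       # merge the two components
            bars.append((0.0, d))  # one component dies at scale d
    bars.append((0.0, math.inf))   # the surviving component
    return bars


# Two nearby points and one distant point: a short bar and a long bar.
print(h0_barcode([(0, 0), (1, 0), (10, 0)]))
# [(0.0, 1.0), (0.0, 9.0), (0.0, inf)]
```

Long bars mark groups that stay separate over a wide range of scales, which is the sense in which persistence tracks similarity between compounds.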


PLoS ONE ◽  
2018 ◽  
Vol 13 (11) ◽  
pp. e0206409 ◽  
Author(s):  
Stephen Solis-Reyes ◽  
Mariano Avino ◽  
Art Poon ◽  
Lila Kari
