PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data

Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

2021 ◽ Vol 18 (14) ◽ pp. 7534

Author(s): Ke Wang ◽ Qingwen Xue ◽ Jian John Lu

Keyword(s): Machine Learning ◽ High Risk ◽ Loss Function ◽ Class Imbalance ◽ Support Vector ◽ Trajectory Data ◽ Recognition Model ◽ Learning Framework ◽ Sampling Cost ◽ Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Because driving data are class-imbalanced, high-risk samples, as the minority class, are usually handled poorly by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle the class-imbalance problem in the recognition of risky drivers. The hyperparameters that control sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of predicted probability.
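The cost-sensitive cross-entropy idea mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `weighted_cross_entropy`, the `pos_weight` parameter, and the toy data are assumptions made for this example only. It shows how up-weighting the minority (risky) class raises the penalty for a model that misses the risky driver.

```python
import math


def weighted_cross_entropy(y_true, p_pred, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy in which positive (minority) samples
    are weighted by pos_weight. With pos_weight=1.0 this is the
    ordinary, unweighted cross-entropy. (Hypothetical helper.)"""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


# Toy data (assumed): 1 risky driver among 20, i.e. a 5% minority class.
y = [1] + [0] * 19
p_ignore = [0.05] * 20            # predicts "low risk" for everyone
p_detect = [0.60] + [0.05] * 19   # assigns high risk to the risky driver

for w in (1.0, 19.0):
    print(
        f"pos_weight={w}: "
        f"ignore={weighted_cross_entropy(y, p_ignore, w):.4f}, "
        f"detect={weighted_cross_entropy(y, p_detect, w):.4f}"
    )
```

In the framework described above, a weight of this kind would be one of the hyperparameters searched automatically rather than fixed by hand; the value 19.0 here is only an illustrative choice that balances the two classes in the toy data.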

Automated clinical computational biology: an interpretable machine learning framework to predict disease severity and stratify patients from clinical data

10.31219/osf.io/9xc2j ◽ 2018

Author(s): Soumya Banerjee

Keyword(s): Machine Learning ◽ Disease Severity ◽ Clinical Data ◽ Model Building ◽ Learning Experience ◽ Machine Learning Algorithms ◽ Close Collaboration ◽ Learning Framework ◽ Novel Biomarkers ◽ Automated Machine Learning

We outline an automated computational and machine learning framework that predicts disease severity and stratifies patients. We apply our framework to available clinical data. Our algorithm automatically generates insights and predicts disease severity with minimal operator intervention. The computational framework presented here can be used to stratify patients, predict disease severity and propose novel biomarkers for disease. Insights from machine learning algorithms coupled with clinical data may help guide therapy, personalize treatment and help clinicians understand the change in disease over time. Computational techniques like these can be used in translational medicine in close collaboration with clinicians and healthcare providers. Our models are also interpretable, allowing clinicians with minimal machine learning experience to engage in model building. This work is a step towards automated machine learning in the clinic.

Cardea: An Open Automated Machine Learning Framework for Electronic Health Records

2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA) ◽ 10.1109/dsaa49011.2020.00068 ◽ 2020

Author(s): Sarah Alnegheimish ◽ Najat Alrashed ◽ Faisal Aleissa ◽ Shahad Althobaiti ◽ Dongyu Liu ◽ ...

Keyword(s): Machine Learning ◽ Electronic Health Records ◽ Health Records ◽ Learning Framework ◽ Automated Machine Learning ◽ Electronic Health

Machine Learning Tools For off-Target Early Safety Assessment of Small Molecules In Drug Discovery (Single Task Neural Networks Vs Automated Machine Learning)

10.21203/rs.3.rs-957525/v1 ◽ 2021

Author(s): Doha Naga ◽ Wolfgang Muster ◽ Eunice Musvasva ◽ Gerhard F. Ecker

Keyword(s): Machine Learning ◽ Drug Discovery ◽ Pharmaceutical Companies ◽ Learning Tools ◽ Safety Issues ◽ Learning Framework ◽ Automated Machine Learning ◽ And Performance ◽ Preclinical Safety

Unpredicted drug safety issues constitute the majority of failures in the pharmaceutical industry according to several studies [1-3]. Some of these preclinical safety issues could be attributed to the non-selective binding of compounds to targets other than their intended therapeutic target, causing undesired adverse events. Consequently, pharmaceutical companies, including Roche, routinely run in-vitro safety screens to detect off-target activities prior to preclinical and clinical studies. Here we present a machine learning framework that aims to predict the activities of our in-house 50 off-target panel [4] for ~4000 compounds, directly from their structure. This framework is intended to guide chemists in the drug design process prior to synthesis and to accelerate drug discovery. It incorporates different ML approaches such as deep learning and automated machine learning. Outcomes from the different methods are compared in terms of efficiency and efficacy. The most important challenges and factors impacting model construction and performance, along with suggestions on how to overcome them, are also discussed.

Towards intelligent geospatial data discovery: a machine learning framework for search ranking

International Journal of Digital Earth ◽ 10.1080/17538947.2017.1371255 ◽ 2017 ◽ Vol 11 (9) ◽ pp. 956-971 ◽ Cited By ~ 8

Author(s): Yongyao Jiang ◽ Yun Li ◽ Chaowei Yang ◽ Fei Hu ◽ Edward M. Armstrong ◽ ...

Keyword(s): Machine Learning ◽ Geospatial Data ◽ Data Discovery ◽ Learning Framework

A Scalable and Automated Machine Learning Framework to Support Risk Management

Lecture Notes in Computer Science - Agents and Artificial Intelligence ◽ 10.1007/978-3-030-71158-0_14 ◽ 2021 ◽ pp. 291-307

Author(s): Luís Ferreira ◽ André Pilastri ◽ Carlos Martins ◽ Pedro Santos ◽ Paulo Cortez

Keyword(s): Machine Learning ◽ Risk Management ◽ Learning Framework ◽ Automated Machine Learning

D-SmartML: A Distributed Automated Machine Learning Framework

2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS) ◽ 10.1109/icdcs47774.2020.00115 ◽ 2020

Author(s): Ahmed Abd Elrahman ◽ Mohamed El Helw ◽ Radwa Elshawi ◽ Sherif Sakr

Keyword(s): Machine Learning ◽ Learning Framework ◽ Automated Machine Learning

Earth Model Building in Real-Time with an Automated Machine Learning Framework - A Midland Basin Example

Proceedings of the 9th Unconventional Resources Technology Conference ◽ 10.15530/urtec-2021-5659 ◽ 2021

Author(s): Altay Sansal ◽ Muhlis Unaldi ◽ Edward Tian ◽ Gareth Taylor

Keyword(s): Machine Learning ◽ Real Time ◽ Model Building ◽ Earth Model ◽ Midland Basin ◽ Learning Framework ◽ Automated Machine Learning

Automated Machine-Learning Framework Integrating Histopathological and Radiological Information for Predicting IDH1 Mutation Status in Glioma

Frontiers in Bioinformatics ◽ 10.3389/fbinf.2021.718697 ◽ 2021 ◽ Vol 1

Author(s): Dingqian Wang ◽ Cuicui Liu ◽ Xiuying Wang ◽ Xuejun Liu ◽ Chuanjin Lan ◽ ...

Keyword(s): Machine Learning ◽ Prediction Accuracy ◽ Frozen Section ◽ Imaging Features ◽ Radiological Imaging ◽ Learning Framework ◽ Inversion Recovery Imaging ◽ Level Information ◽ Mri Sequences ◽ Automated Machine Learning

Diffuse gliomas are the most common malignant primary brain tumors. Identification of isocitrate dehydrogenase 1 (IDH1) mutations aids the diagnostic classification of these tumors and the prediction of their clinical outcomes. While histology continues to play a key role in frozen section diagnosis, as a diagnostic reference and as a method for monitoring disease progression, recent research has demonstrated the ability of multi-parametric magnetic resonance imaging (MRI) sequences to predict IDH genotypes. In this paper, we aim to improve the prediction accuracy of IDH1 genotypes by integrating multi-modal imaging information from digitized histopathological data derived from routine histological slide scans and from MRI sequences including T1-contrast (T1) and fluid-attenuated inversion recovery imaging (T2-FLAIR). We have established an automated framework to process, analyze and integrate the histopathological and radiological information from high-resolution pathology slides and multi-sequence MRI scans. Our machine-learning framework comprehensively computed multi-level information, including molecular-level, cellular-level, and texture-level information, to reflect predictive IDH genotypes. Firstly, an automated pre-processing step was developed to select the regions of interest (ROIs) from pathology slides. Secondly, to interactively fuse the multimodal complementary information, comprehensive feature information was extracted from the pathology ROIs and from segmented tumor regions (enhanced tumor, edema and non-enhanced tumor) in the MRI sequences. Thirdly, a Random Forest (RF)-based algorithm was employed to identify and quantitatively characterize histopathological and radiological imaging origins, respectively. Finally, we integrated multi-modal imaging features with a machine-learning algorithm and tested the performance of the framework for IDH1 genotyping; we also provided visual and statistical explanations to support the understanding of prediction outcomes. The training and testing experiments on 217 pathologically verified IDH1-genotyped glioma cases from multiple sources validated that our fully automated machine-learning model predicted IDH1 genotypes with greater accuracy and reliability than models based on radiological imaging data only. The accuracy of IDH1 genotype prediction was 0.90, compared to 0.82 for the radiomic result. Thus, the integration of multi-parametric imaging features for automated analysis of cross-modal biomedical data improved the prediction accuracy of glioma IDH1 genotypes.

Geospatial Data Analytics: A Machine Learning Perspective

SSRN Electronic Journal ◽ 10.2139/ssrn.3599656 ◽ 2020

Author(s): Saneev Kumar Das ◽ Meenakshi Pant

Keyword(s): Machine Learning ◽ Data Analytics ◽ Geospatial Data
