Large-Scale Land Cover Mapping on Sentinel-1 SAR Imagery Using Deep Transfer Learning

Author(s):  
Sanja Šćepanović ◽  
Oleg Antropov ◽  
Pekka Laurila ◽  
Vladimir Ignatenko ◽  
Jaan Praks

Land cover mapping and monitoring are essential for understanding the environment and the effects of human activities on it. Automatic approaches to land cover mapping are predominantly based on traditional machine learning, which requires heuristic feature design. Such approaches are relatively slow and often suitable only for a particular type of satellite sensor or geographical area. Recently, deep learning has outperformed traditional machine learning approaches on a range of image processing tasks, including image classification and segmentation. In this study, we demonstrated the suitability of deep learning models for large-scale land cover mapping using satellite C-band SAR images. We used a set of 14 ESA Sentinel-1 scenes acquired during the summer season over a wide area of Finland, representative of the land cover in the country. These images were used as input to seven state-of-the-art deep learning models for semantic segmentation, namely U-Net, DeepLabV3+, PSPNet, BiSeNet, SegNet, FC-DenseNet, and FRRN-B. The models were pre-trained on the ImageNet dataset and further fine-tuned in this study. To the best of our knowledge, this is the first successful demonstration of transfer learning for SAR imagery in the context of wide-area land cover mapping. The CORINE land cover map produced by the Finnish Environment Institute was used as a reference, and the models were trained to distinguish between five Level-1 CORINE classes. Upon evaluation and benchmarking, we found that all the models demonstrated solid performance, with the top model, FC-DenseNet, achieving an overall accuracy of 90.66%. These results indicate the suitability of deep learning methods for supporting efficient wide-area mapping using satellite SAR imagery.
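The abstract does not give the authors' training pipeline, but a minimal sketch of the general recipe it describes (a pre-trained semantic segmentation model fine-tuned for five CORINE Level-1 classes) could look as follows. The choice of DeepLabV3 from torchvision, the 3-channel packing of the SAR polarizations, and all hyperparameters are assumptions for illustration, not the authors' setup.

```python
# Illustrative sketch (not the authors' code): fine-tuning a pre-trained
# DeepLabV3 for 5-class CORINE Level-1 segmentation of Sentinel-1 SAR tiles.
# Assumes VV/VH backscatter plus their ratio stacked into 3 input channels so
# that the pre-trained (ImageNet-backbone) weights can be reused directly.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

NUM_CLASSES = 5  # CORINE Level-1 classes

model = deeplabv3_resnet50(weights="DEFAULT")       # torchvision's pre-trained weights as a stand-in
model.classifier = DeepLabHead(2048, NUM_CLASSES)   # replace the prediction head for 5 classes

criterion = nn.CrossEntropyLoss(ignore_index=255)   # 255 = unlabeled pixels (assumed convention)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    """images: (B, 3, H, W) float tensor, masks: (B, H, W) long tensor of class ids."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]        # (B, NUM_CLASSES, H, W)
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```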

2019 ◽  
Author(s):  
Mojtaba Haghighatlari ◽  
Gaurav Vishwakarma ◽  
Mohammad Atif Faiz Afzal ◽  
Johannes Hachmann

We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high RI for applications in opto-electronics.
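A minimal sketch of the two ideas in the abstract that lend themselves to code: up-weighting the under-represented high-RI molecules during training, and scoring the ranking of top candidates with a web-search-style metric. The abstract does not name the metric; NDCG is assumed here purely for illustration, and the threshold and boost factor are hypothetical.

```python
# Illustrative sketch (not the authors' pipeline): bias training toward the
# sparse high-RI tail via sample weights, and evaluate top-candidate ranking
# with NDCG@k (assumed stand-in for the unnamed web-search ranking metric).
import numpy as np
from sklearn.metrics import ndcg_score

def sample_weights(ri_values, high_ri_threshold=1.7, boost=5.0):
    """Up-weight molecules in the under-represented high-RI regime (threshold/boost are assumptions)."""
    ri_values = np.asarray(ri_values)
    return np.where(ri_values >= high_ri_threshold, boost, 1.0)

def top_k_ranking_quality(ri_true, ri_pred, k=1000):
    """NDCG@k over the predicted ranking of candidate refractive indices."""
    return ndcg_score(np.asarray(ri_true)[None, :],
                      np.asarray(ri_pred)[None, :], k=k)
```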


Reports ◽  
2019 ◽  
Vol 2 (4) ◽  
pp. 26 ◽  
Author(s):  
Govind Chada

Increasing radiologist workloads and the growth of primary care radiology services make it relevant to explore the use of artificial intelligence (AI), and particularly deep learning, to provide diagnostic assistance to radiologists and primary care physicians and to improve the quality of patient care. This study investigates new model architectures and deep transfer learning to improve performance in detecting abnormalities of the upper extremities while training with limited data. DenseNet-169, DenseNet-201, and InceptionResNetV2 deep learning models were implemented and evaluated on the humerus and finger radiographs from MURA, a large public dataset of musculoskeletal radiographs. These architectures were selected because of their high recognition accuracy in a benchmark study. The DenseNet-201 and InceptionResNetV2 models, employing deep transfer learning to make the most of the limited training data, detected abnormalities in the humerus radiographs with 95% CI accuracies of 83–92% and sensitivities greater than 0.9, allowing these models to serve as useful initial screening tools that prioritize studies for expedited review. Performance on the finger radiographs was not as promising, possibly due to large inter-radiologist variation in the labels. It is suggested that the causes of this variation be further explored using machine learning approaches, which may lead to appropriate remediation.
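A hedged sketch of the deep transfer learning setup described above, since the abstract does not give the training details: an ImageNet-pretrained DenseNet-201 backbone with a new binary head for normal/abnormal classification of MURA radiographs. Input size, optimizer, and the frozen-backbone first stage are assumptions.

```python
# Sketch only: DenseNet-201 transfer learning for binary abnormality detection
# on musculoskeletal radiographs. Hyperparameters are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.DenseNet201(
    weights="imagenet", include_top=False, input_shape=(320, 320, 3))
base.trainable = False  # first stage: train only the new head on the limited data

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # abnormal vs. normal study
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
```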


2020 ◽  
Vol 34 (7) ◽  
pp. 717-730 ◽  
Author(s):  
Matthew C. Robinson ◽  
Robert C. Glen ◽  
Alpha A. Lee

Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of the area under the receiver operating characteristic curve as a metric in virtual screening. We further suggest that the area under the precision–recall curve should be used in conjunction with the receiver operating characteristic curve. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross-validation.
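A minimal illustration of the metric point above: when actives are rare, as in virtual screening, report the area under the precision–recall curve (average precision) alongside ROC AUC. The synthetic scores below are purely for demonstration.

```python
# Demonstration with synthetic data: ROC AUC vs. precision-recall AUC under
# heavy class imbalance (~1% actives), as is typical in virtual screening.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = (rng.random(10_000) < 0.01).astype(int)   # ~1% actives
y_score = rng.random(10_000) + 0.3 * y_true        # weakly informative scores

print("ROC AUC:", roc_auc_score(y_true, y_score))             # can look flattering
print("PR  AUC:", average_precision_score(y_true, y_score))   # far less forgiving
```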


2021 ◽  
Author(s):  
Leonid Joffe

Deep learning models for tabular data are restricted to a specific table format. Computer vision models, on the other hand, have broader applicability: they work on all images and can learn universal features. This allows them to be trained on enormous corpora and gives them very wide transferability and applicability. Inspired by these properties, this work presents an architecture that aims to capture useful patterns across arbitrary tables. The model is trained on randomly sampled subsets of features from a table, processed by a convolutional network, and its internal representation captures the feature interactions that appear in the table. Experimental results show that the embeddings produced by this model are useful and transferable across many commonly used machine learning benchmark datasets. Specifically, using the embeddings produced by the network as additional features improves the performance of a number of classifiers.
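A sketch of the evaluation protocol described in the last sentence, under stated assumptions: `embed(X)` stands in for the (hypothetical) embedding function exposed by the pre-trained convolutional table model, and its output is concatenated with the raw features before fitting a standard classifier.

```python
# Illustrative protocol: compare a classifier on raw features vs. raw features
# augmented with learned table embeddings. `embed` is a hypothetical callable
# returning an (n_samples, d_embed) array from the pre-trained table model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def augment_with_embeddings(X, embed):
    """Append learned table embeddings to the original feature matrix."""
    return np.hstack([X, embed(X)])

def benchmark(X, y, embed):
    clf = LogisticRegression(max_iter=1000)
    baseline = cross_val_score(clf, X, y, cv=5).mean()
    augmented = cross_val_score(clf, augment_with_embeddings(X, embed), y, cv=5).mean()
    return baseline, augmented
```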


2020 ◽  
Vol 245 ◽  
pp. 06019 ◽
Author(s):  
Kim Albertsson ◽  
Sitong An ◽  
Sergei Gleyzer ◽  
Lorenzo Moneta ◽  
Joana Niermann ◽  
...  

ROOT provides, through TMVA, machine learning tools for data analysis at HEP experiments and beyond. We present recently added features in TMVA and the strategy for future developments in the diversified machine learning landscape. The focus is on fast machine learning inference, which enables analysts to deploy their machine learning models rapidly on large-scale datasets. The new developments are paired with newly designed C++ and Python interfaces supporting modern C++ paradigms and full interoperability in the Python ecosystem. We also present a new deep learning implementation of convolutional neural networks using the cuDNN library for GPUs. We show benchmarking results in terms of training and inference time, comparing against other machine learning libraries such as Keras/TensorFlow.
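For context, a minimal PyROOT sketch of a TMVA classification workflow using the long-standing Factory/DataLoader API; the newer interfaces and fast-inference features highlighted in the abstract are not reproduced here. The input file name, tree names, and variable names are assumptions.

```python
# Minimal TMVA classification sketch via PyROOT (classic Factory/DataLoader API).
# "events.root", the "sig"/"bkg" trees, and the variables are hypothetical.
import ROOT

data_file = ROOT.TFile.Open("events.root")
sig_tree, bkg_tree = data_file.Get("sig"), data_file.Get("bkg")

out_file = ROOT.TFile.Open("tmva_output.root", "RECREATE")
factory = ROOT.TMVA.Factory("TMVAClassification", out_file,
                            "!V:AnalysisType=Classification")
loader = ROOT.TMVA.DataLoader("dataset")
for var in ("pt", "eta", "mass"):            # assumed input variables
    loader.AddVariable(var, "F")
loader.AddSignalTree(sig_tree, 1.0)
loader.AddBackgroundTree(bkg_tree, 1.0)
loader.PrepareTrainingAndTestTree(ROOT.TCut(""), "SplitMode=Random:NormMode=NumEvents")

factory.BookMethod(loader, ROOT.TMVA.Types.kBDT, "BDT", "NTrees=200:MaxDepth=3")
factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()
out_file.Close()
```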

