A general approach for improving deep learning-based medical relation extraction using a pre-trained model and fine-tuning

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v1 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, with classification accuracy gains of 1.91% to 6.69%. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
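
The pre-training task described in this abstract (predicting randomly contaminated observations from a pixel's full time series) can be illustrated with a minimal Python sketch. The abstract does not specify the noise model, the contamination rate, or the loss, so the uniform noise, the `rate` and `noise` parameters, and the masked mean-squared-error target below are all assumptions made for illustration, not the authors' implementation.

```python
import random


def contaminate(series, rate=0.15, noise=0.5, seed=0):
    """Return (corrupted_series, mask) for one pixel's time series.

    series: list of observations, each a list of band values.
    rate:   assumed fraction of observations to contaminate.
    noise:  assumed half-width of the uniform noise added to each band.
    """
    rng = random.Random(seed)
    n_corrupt = max(1, round(rate * len(series)))
    chosen = set(rng.sample(range(len(series)), n_corrupt))
    corrupted, mask = [], []
    for t, obs in enumerate(series):
        if t in chosen:
            # Perturb every band of the selected observation.
            corrupted.append([v + rng.uniform(-noise, noise) for v in obs])
            mask.append(True)
        else:
            corrupted.append(list(obs))
            mask.append(False)
    return corrupted, mask


def masked_mse(predicted, original, mask):
    """Mean squared error computed over contaminated observations only."""
    errors = [
        (p - o) ** 2
        for pred_obs, orig_obs, m in zip(predicted, original, mask)
        if m
        for p, o in zip(pred_obs, orig_obs)
    ]
    return sum(errors) / len(errors)


# Toy series: 10 acquisition dates, 2 spectral bands (made-up values).
series = [[0.1 * t, 0.2 * t] for t in range(10)]
corrupted, mask = contaminate(series, rate=0.2)
print(sum(mask), "of", len(series), "observations contaminated")
```

In a real pipeline, a network would take `corrupted` as input and be trained so that its output at the masked positions matches `series`; `masked_mse` stands in for that training objective.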

Using distant supervision to augment manually annotated data for relation extraction

10.1101/626226 ◽

2019 ◽

Author(s):

Peng Su ◽

Gang Li ◽

Cathy Wu ◽

K. Vijay-Shanker

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Relation Extraction ◽

Biomedical Literature ◽

Training Data ◽

Distant Supervision ◽

Large Size ◽

Domain Expertise

Significant progress has recently been made in applying deep learning to natural language processing tasks. However, deep learning models typically require a large amount of annotated training data, while often only small labeled datasets are available for many natural language processing tasks in biomedical literature. Building large datasets for deep learning is expensive since it involves considerable human effort and usually requires domain expertise in specialized fields. In this work, we consider augmenting manually annotated data with large amounts of data obtained using distant supervision. Because data obtained by distant supervision is often noisy, we first apply heuristics to remove some of the incorrect annotations. Then, using methods inspired by transfer learning, we show that the resulting models outperform models trained on the original manually annotated sets.
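
As a rough illustration of distant supervision with a noise-reducing heuristic, the Python sketch below labels any sentence that mentions both entities of a known related pair. The abstract does not say which heuristics the authors use, so the token-distance filter (`max_gap`), the function name, and the toy entity names are hypothetical.

```python
def distant_label(sentences, known_pairs, max_gap=10):
    """Label sentences that mention a known entity pair.

    sentences:   list of token lists.
    known_pairs: dict mapping (entity_a, entity_b) to a relation name.
    max_gap:     hypothetical heuristic; matches whose entities are more
                 than this many tokens apart are dropped as likely noise.
    """
    labeled = []
    for tokens in sentences:
        for (a, b), relation in known_pairs.items():
            if a in tokens and b in tokens:
                gap = abs(tokens.index(a) - tokens.index(b))
                if gap <= max_gap:
                    labeled.append((tokens, a, b, relation))
    return labeled


# Toy knowledge base and sentences (placeholder names, not real data).
pairs = {("geneA", "diseaseX"): "associated_with"}
sentences = [
    ["geneA", "is", "linked", "to", "diseaseX"],
    ["geneA"] + ["filler"] * 20 + ["diseaseX"],
]
for tokens, a, b, relation in distant_label(sentences, pairs):
    print(a, relation, b)
```

The automatically labeled examples would then be combined with the manually annotated set; how that combination is weighted or staged is left open here.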

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v3 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, with classification accuracy gains of 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction

npj Digital Medicine ◽

10.1038/s41746-021-00455-y ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Laila Rasmy ◽

Yang Xiang ◽

Ziqian Xie ◽

Cui Tao ◽

Degui Zhi

Keyword(s):

Deep Learning ◽

Electronic Health Records ◽

Large Scale ◽

Training Data ◽

Fine Tuning ◽

Disease Prediction ◽

Operating Characteristics ◽

Health Records ◽

Clinical Databases ◽

Electronic Health

Deep learning (DL)-based predictive models from electronic health records (EHRs) deliver impressive performance in many clinical tasks. Large training cohorts, however, are often required by these models to achieve high accuracy, hindering the adoption of DL-based models in scenarios with limited training data. Recently, bidirectional encoder representations from transformers (BERT) and related models have achieved tremendous successes in the natural language processing domain. The pretraining of BERT on a very large training corpus generates contextualized embeddings that can boost the performance of models trained on smaller datasets. Inspired by BERT, we propose Med-BERT, which adapts the BERT framework originally developed for the text domain to the structured EHR domain. Med-BERT is a contextualized embedding model pretrained on a structured EHR dataset of 28,490,650 patients. Fine-tuning experiments showed that Med-BERT substantially improves the prediction accuracy, boosting the area under the receiver operating characteristics curve (AUC) by 1.21–6.14% in two disease prediction tasks from two clinical databases. In particular, pretrained Med-BERT obtains promising performances on tasks with small fine-tuning training sets and can boost the AUC by more than 20% or obtain an AUC as high as a model trained on a training set ten times larger, compared with deep learning models without Med-BERT. We believe that Med-BERT will benefit disease prediction studies with small local training datasets, reduce data collection expenses, and accelerate the pace of artificial intelligence aided healthcare.
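
For readers unfamiliar with the AUC metric reported here, the following Python sketch computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. It is a generic illustration with made-up labels and scores, not the evaluation code used in the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive cases, 0 for negative cases.
    scores: predicted scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; sorting-based implementations are preferable for large cohorts.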

Transfer Learning for Inference of Metastatic Origin from Whole Slide Histology

10.1101/2021.04.21.440864 ◽

2021 ◽

Author(s):

Geoffrey F. Schau ◽

Hassan Ghani ◽

Erik A. Burlingame ◽

Guillaume Thibault ◽

Joe W. Gray ◽

...

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Primary Tumor ◽

Large Scale ◽

Metastatic Cancer ◽

Clinical Diagnostics ◽

Training Data ◽

Fine Tuning ◽

Learning Approach ◽

Whole Slide Images

Accurate diagnosis of metastatic cancer is essential for prescribing optimal control strategies to halt further spread of metastasizing disease. While pathological inspection aided by immunohistochemistry staining provides a valuable gold standard for clinical diagnostics, deep learning methods have emerged as powerful tools for identifying clinically relevant features of whole slide histology relevant to a tumor’s metastatic origin. Although deep learning models require significant training data to learn effectively, transfer learning paradigms provide mechanisms to circumvent limited training data by first training a model on related data prior to fine-tuning on smaller data sets of interest. In this work we propose a transfer learning approach that trains a convolutional neural network to infer the metastatic origin of tumor tissue from whole slide images of hematoxylin and eosin (H&E) stained tissue sections, and we illustrate the advantages of pre-training the network on whole slide images of primary tumor morphology. We further characterize statistical dissimilarity between primary and metastatic tumors of various indications on patch-level images to highlight limitations of our indication-specific transfer learning approach. Using a primary-to-metastatic transfer learning approach, we achieved a mean class-specific area under the receiver operating characteristic curve (AUROC) of 0.779, which outperformed comparable models trained only on images of primary tumor (mean AUROC of 0.691) or only on images of metastatic tumor (mean AUROC of 0.675), supporting the use of large-scale primary tumor imaging data in developing computer vision models to characterize the metastatic origin of tumor lesions.

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v2 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, with classification accuracy gains of 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Applying Deep Learning Technique for Depression Classification in Social Media Text

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2020.3169 ◽

2020 ◽

Vol 10 (10) ◽

pp. 2446-2451

Author(s):

Hussain Ahmad ◽

Muhammad Zubair Asghar ◽

Fahad M. Alotaibi ◽

Ibrahim A. Hameed

Keyword(s):

Social Media ◽

Deep Learning ◽

Large Scale ◽

Research Area ◽

Mental Illnesses ◽

Training Data ◽

Supervised Machine Learning ◽

Learning Approaches ◽

Close Relationship ◽

Social Media Platforms

In social media, depression identification can be regarded as a complex task because of the complicated nature of mental disorders. Research in this area has evolved recently with the growing popularity of social media platforms, which have become a fundamental part of people's day-to-day lives. Social media platforms and their users share a close relationship, so users' personal lives are reflected on these platforms on several levels. Apart from the complexity of recognising mental illnesses via social media platforms, supervised machine learning approaches such as deep neural networks have yet to be adopted on a large scale because of the inherent difficulty of procuring sufficient quantities of annotated training data. For these reasons, we set out to identify the most effective deep learning model from among selected architectures with a successful track record in supervised learning. The selected model is employed to recognise online users who display depression, since only limited unstructured text data can be extracted from Twitter.

Deep Embedding Sentiment Analysis on Product Reviews Using Naive Bayesian Classifier

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952178 ◽

2019 ◽

pp. 858-864

Author(s):

Nukabathini Mary Saroj Sahithya ◽

Manda Prathyusha ◽

Nakkala Rachana ◽

Perikala Priyanka ◽

P. J. Jyothi

Keyword(s):

Deep Learning ◽

Language Processing ◽

Large Scale ◽

Opinion Mining ◽

Machine Learning Algorithms ◽

Sentiment Classification ◽

Training Data ◽

Fine Tuning ◽

Product Reviews ◽

Deep Embedding

Product reviews are valuable for prospective buyers in helping them make decisions. To this end, different opinion mining techniques have been proposed, where judging a review sentence's orientation (e.g. positive or negative) is one of the key challenges. Recently, deep learning has emerged as an effective means for solving sentiment classification problems. Deep learning is a class of machine learning algorithms that learn in supervised and unsupervised manners. A neural network intrinsically learns a useful representation automatically, without human effort. However, the success of deep learning relies heavily on large-scale training data. We propose a novel deep learning framework for product review sentiment classification which employs widely available ratings as supervision signals. The framework consists of two steps: (1) learning a high-level representation (an embedding space) which captures the general sentiment distribution of sentences through rating information; (2) adding a category layer on top of the embedding layer and using labelled sentences for supervised fine-tuning. We explore two kinds of low-level network structure for modelling review sentences, namely convolutional feature extractors and long short-term memory. The convolutional layer is the core building block of a CNN and consists of kernels. Applications include image and video recognition, natural language processing, and image classification.

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, with classification accuracy gains of 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Automatic Identification of Overpass Structures: A Method of Deep Learning

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8090421 ◽

2019 ◽

Vol 8 (9) ◽

pp. 421 ◽

Cited By ~ 1

Author(s):

Hao Li ◽

Maosheng Hu ◽

Youxin Huang

Keyword(s):

Deep Learning ◽

Target Detection ◽

Road Network ◽

Large Scale ◽

Training Data ◽

Fine Tuning ◽

Automatic Identification ◽

Detection Model ◽

Network Pattern ◽

Accuracy Performance

The identification of overpass structures in road networks has great significance for multi-scale modeling of roads, congestion analysis, and vehicle navigation. Traditional vector-based methods identify overpasses using techniques from computational geometry and graph theory; they rely heavily on hand-designed features and adapt poorly to complex scenes. This paper presents a novel method of identifying overpasses based on a target detection model (Faster-RCNN). This method utilizes a raster representation of vector data and convolutional neural networks (CNNs) to learn task-adaptive features from raster data, then identifies the location of an overpass with a Region Proposal Network (RPN). The contributions of this paper are: (1) An overpass labelling geodatabase (OLGDB) for the OpenStreetMap (OSM) road network data of six typical cities in China is established; (2) Three different CNNs (ZF-net, VGG-16, Inception-ResNet V2) are integrated into Faster-RCNN and evaluated by accuracy performance; (3) The optimal combination of learning rate and batch size is determined by fine-tuning; and (4) Five geometric metrics (perimeter, area, squareness, circularity, and W/L) are synthesized into image bands to enhance the training data, and their contribution to the overpass identification task is determined. The experimental results show that the proposed method has good accuracy performance (around 90%), which could be improved by expanding OLGDB and switching to more sophisticated target detection models. The deep learning target detection model has great application potential in large-scale road network pattern recognition: it can task-adaptively learn road structure features and easily extend to other road network patterns.
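
To make the geometric metrics mentioned in contribution (4) more concrete, the Python sketch below computes four of them for a simple polygon. The abstract does not define the metrics, so the formulas here are assumptions: area uses the shoelace formula, circularity uses the common definition 4πA/P², and W/L is taken as the shorter over the longer side of the axis-aligned bounding box. Squareness is omitted because its definition is not given, and the step that encodes these values into image bands is not shown.

```python
import math


def polygon_metrics(points):
    """Compute shape metrics for a simple polygon.

    points: list of (x, y) vertices in order, without repeating the first.
    All formulas are illustrative assumptions, not the paper's definitions.
    """
    n = len(points)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1          # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Shorter and longer sides of the axis-aligned bounding box.
    width, length = sorted((max(xs) - min(xs), max(ys) - min(ys)))
    return {
        "perimeter": perimeter,
        "area": area,
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "w_over_l": width / length,
    }


# A 4 x 2 rectangle: perimeter 12, area 8, W/L 0.5.
print(polygon_metrics([(0, 0), (4, 0), (4, 2), (0, 2)]))
```

Degenerate inputs (for example, collinear points giving a zero-length bounding box side) would raise a division error and need separate handling.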
