Better Fine-Tuning via Instance Weighting for Text Classification

Author(s):  
Zhi Wang ◽  
Wei Bi ◽  
Yan Wang ◽  
Xiaojiang Liu

Transfer learning for deep neural networks has achieved great success in many text classification applications. A simple yet effective transfer learning method is to fine-tune the pretrained model parameters. Previous fine-tuning works mainly focus on the pre-training stage and investigate how to pretrain a set of parameters that can best help the target task. In this paper, we propose an Instance Weighting based Finetuning (IW-Fit) method, which revises the fine-tuning stage to improve the final performance on the target domain. IW-Fit dynamically adjusts instance weights at each fine-tuning epoch to accomplish two goals: 1) identify and learn the specific knowledge of the target domain effectively; 2) well preserve the shared knowledge between the source and the target domains. The instance weighting metrics used in IW-Fit are model-agnostic and easy to implement for general DNN-based classifiers. Experimental results show that IW-Fit consistently improves the classification accuracy on the target domain.
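As a rough illustration of the idea, a per-epoch weighted fine-tuning loop might look as follows in PyTorch. The weighting metric here (up-weighting instances that a frozen copy of the pre-trained source model is unsure about) is a hypothetical stand-in; the abstract does not specify the paper's actual metrics.

```python
import torch
import torch.nn.functional as F

def iw_fit_epoch(model, source_model, loader, optimizer):
    """One fine-tuning epoch with dynamic per-instance weights (sketch)."""
    model.train()
    source_model.eval()  # frozen copy of the pre-trained parameters
    for x, y in loader:
        with torch.no_grad():
            # Confidence of the source model on the true label; instances it
            # is unsure about are assumed to carry target-specific knowledge.
            src_prob = F.softmax(source_model(x), dim=1)
            conf = src_prob.gather(1, y.unsqueeze(1)).squeeze(1)
            weights = 2.0 - conf                 # in [1, 2]
            weights = weights / weights.mean()   # normalise within the batch
        loss = (weights * F.cross_entropy(model(x), y, reduction="none")).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```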

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fangzhou Xu ◽  
Yunjing Miao ◽  
Yanan Sun ◽  
Dongju Guo ◽  
Jiali Xu ◽  
...  

Deep learning networks have been successfully applied with transfer learning so that models trained on a source domain can be adapted to different target domains. This study uses multiple convolutional neural networks to decode the electroencephalogram (EEG) of stroke patients in order to design an effective motor imagery (MI) brain-computer interface (BCI) system. Fine-tuning is introduced to transfer model parameters and reduce training time. The performance of the proposed framework is evaluated by the models' ability to perform two-class MI recognition. The results show that the best framework is the combination of EEGNet and the fine-tuned transferred model. The average classification accuracy of the proposed model for 11 subjects is 66.36%, and its algorithmic complexity is much lower than that of the other models. This strong performance indicates that the EEGNet model has great potential for MI-based BCI stroke rehabilitation. It also demonstrates the efficiency of transfer learning for improving the performance of EEG-based BCI systems for stroke rehabilitation.
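A minimal sketch of the parameter-transfer step, assuming PyTorch and some EEGNet implementation; the class interface, checkpoint path, and the choice to retrain only a new classifier head are illustrative assumptions, not the authors' exact recipe:

```python
import torch
import torch.nn as nn

def transfer_eegnet(eegnet_cls, ckpt_path, n_classes=2):
    """Load weights trained on source subjects, then fine-tune a new head."""
    model = eegnet_cls(n_classes=n_classes)
    model.load_state_dict(torch.load(ckpt_path))  # source-subject parameters
    for p in model.parameters():                  # freeze the feature extractor
        p.requires_grad = False
    model.classifier = nn.LazyLinear(n_classes)   # new trainable output layer
    return model  # train only model.classifier on the target subject's EEG
```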


2020 ◽  
Author(s):  
Xinhao Li ◽  
Denis Fourches

Deep neural networks can directly learn from chemical structures without extensive, user-driven selection of descriptors in order to predict molecular properties/activities with high reliability. But these approaches typically require large training sets to learn the endpoint-specific structural features and ensure reasonable prediction accuracy. Even though large datasets are becoming the new normal in drug discovery, especially when it comes to high-throughput screening or metabolomics datasets, one should also consider smaller datasets with challenging endpoints to model and forecast. Thus, it would be highly relevant to better utilize the tremendous compendium of unlabeled compounds from publicly-available datasets for improving the model performances for the user's particular series of compounds. In this study, we propose the Molecular Prediction Model Fine-Tuning (MolPMoFiT) approach, an effective transfer learning method based on self-supervised pre-training + task-specific fine-tuning for QSPR/QSAR modeling. A large-scale molecular structure prediction model is pre-trained using one million unlabeled molecules from ChEMBL in a self-supervised learning manner, and can then be fine-tuned on various QSPR/QSAR tasks for smaller chemical datasets with specific endpoints. Herein, the method is evaluated on four benchmark datasets (lipophilicity, FreeSolv, HIV, and blood-brain barrier penetration). The results showed the method can achieve strong performances for all four datasets compared to other state-of-the-art machine learning modeling techniques reported in the literature so far.
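The two stages can be sketched as follows. MolPMoFiT follows a ULMFiT-style recipe over SMILES tokens, so the plain LSTM and the layer sizes below are simplifying assumptions:

```python
import torch.nn as nn

class SmilesLM(nn.Module):
    """Character-level SMILES language model for self-supervised pre-training."""
    def __init__(self, vocab_size, d=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d)
        self.rnn = nn.LSTM(d, d, num_layers=2, batch_first=True)
        self.head = nn.Linear(d, vocab_size)   # predicts the next token

    def forward(self, tokens):                 # tokens: [batch, seq_len]
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h)                    # next-token logits

def to_qsar_regressor(lm):
    """Fine-tuning stage: keep the pre-trained encoder, swap the LM head
    for a single-output property head (e.g., lipophilicity regression)."""
    lm.head = nn.Linear(lm.head.in_features, 1)
    return lm
```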


2021 ◽  
Vol 18 (2) ◽  
pp. 56-65
Author(s):  
Marcelo Romero ◽  
Matheus Gutoski ◽  
Leandro Takeshi Hattori ◽  
Manassés Ribeiro ◽  
...  

Transfer learning is a paradigm that consists in training and testing classifiers with datasets drawn from distinct distributions. This technique makes it possible to solve a particular problem using a model that was trained for another purpose. In recent years, this practice has become very popular due to the increasing number of publicly available pre-trained models that can be fine-tuned for different scenarios. However, the relationship between the datasets used for training and testing the model is usually not addressed, especially when the fine-tuning process is done only for the fully connected layers of a Convolutional Neural Network with pre-trained weights. This work presents a study of the relationship between the datasets used in a transfer learning process, in terms of the performance achieved by the models and the complexities and similarities of the datasets. For this purpose, we fine-tune the final layer of Convolutional Neural Networks with pre-trained weights using diverse soft biometrics datasets. An evaluation of the performance of the models when tested with datasets different from the one used for training is presented. Complexity and similarity metrics are also used in the evaluation.
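The setting studied here, freezing a pre-trained CNN and re-training only its final fully connected layer, can be reproduced in a few lines of PyTorch; ResNet-50 and the two-class head are illustrative choices:

```python
import torch.nn as nn
from torchvision import models

num_classes = 2  # e.g., a binary soft-biometric attribute (assumption)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for p in model.parameters():          # freeze all convolutional weights
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # only layer trained

trainable = [p for p in model.parameters() if p.requires_grad]
# pass `trainable` to the optimizer; everything else keeps its weights
```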


Author(s):  
Saheb Chhabra ◽  
Puspita Majumdar ◽  
Mayank Vatsa ◽  
Richa Singh

In real-world applications, commercial off-the-shelf systems are utilized for performing automated facial analysis, including face recognition, emotion recognition, and attribute prediction. However, a majority of these commercial systems act as black boxes due to the inaccessibility of the model parameters, which makes it challenging to fine-tune the models for specific applications. Stimulated by the advances in adversarial perturbations, this research proposes the concept of Data Fine-tuning to improve the classification accuracy of a given model without changing its parameters. This is accomplished by modeling the task as a data (image) perturbation problem. A small amount of “noise” is added to the input with the objective of minimizing the classification loss without affecting the (visual) appearance. Experiments performed on three publicly available datasets, LFW, CelebA, and MUCT, demonstrate the effectiveness of the proposed concept.
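A simplified, white-box sketch of the idea in PyTorch: the model's parameters stay frozen while a single small additive perturbation, shared across images, is optimised to reduce the classification loss. The epsilon bound and optimiser are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def data_finetune(model, loader, epochs=5, eps=4 / 255, lr=1e-2):
    """Learn one small input perturbation; the model itself is never updated."""
    for p in model.parameters():
        p.requires_grad = False
    model.eval()
    x0, _ = next(iter(loader))
    delta = torch.zeros_like(x0[:1], requires_grad=True)  # shared "noise"
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x + delta), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():      # keep the noise visually imperceptible
                delta.clamp_(-eps, eps)
    return delta.detach()
```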


Author(s):  
Fouzia Altaf ◽  
Syed M. S. Islam ◽  
Naeem Khalid Janjua

Deep learning has provided numerous breakthroughs in natural imaging tasks. However, its successful application to medical images is severely handicapped by the limited amount of annotated training data. Transfer learning is commonly adopted for medical imaging tasks. However, a large covariate shift between the source domain of natural images and the target domain of medical images results in poor transfer learning. Moreover, the scarcity of annotated data for medical imaging tasks causes further problems for effective transfer learning. To address these problems, we develop an augmented ensemble transfer learning technique that leads to significant performance gains over conventional transfer learning. Our technique uses an ensemble of deep learning models, where the architecture of each network is modified with extra layers to account for the dimensionality change between the images of the source and target data domains. Moreover, the model is hierarchically tuned to the target domain with augmented training data. Along with the network ensemble, we also utilize an ensemble of dictionaries that are based on features extracted from the augmented models. The dictionary ensemble provides an additional performance boost to our method. We first establish the effectiveness of our technique with the challenging ChestX-ray14 radiography data set. Our experimental results show more than 50% reduction in the error rate with our method as compared to the baseline transfer learning technique. We then apply our technique to a recent COVID-19 data set for binary and multi-class classification tasks. Our technique achieves 99.49% accuracy for the binary classification, and 99.24% for multi-class classification.
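One ensemble member might be assembled as below, with an extra input layer bridging the dimensionality gap between single-channel radiographs and the 3-channel inputs a natural-image backbone expects. The DenseNet backbone, the 1x1 bridging convolution, and the averaging rule are illustrative assumptions; the dictionary ensemble is omitted.

```python
import torch
import torch.nn as nn
from torchvision import models

def adapted_member(n_classes=14):  # 14 labels as in ChestX-ray14
    backbone = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
    backbone.classifier = nn.Linear(backbone.classifier.in_features, n_classes)
    return nn.Sequential(
        nn.Conv2d(1, 3, kernel_size=1),  # extra layer: grayscale -> 3 channels
        backbone,
    )

ensemble = [adapted_member() for _ in range(3)]

def ensemble_predict(x):
    # Average the members' predictions (one simple way to combine them).
    with torch.no_grad():
        return torch.stack([m(x).softmax(dim=1) for m in ensemble]).mean(0)
```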


2021 ◽  
Vol 12 ◽  
Author(s):  
Lei Feng ◽  
Baohua Wu ◽  
Yong He ◽  
Chu Zhang

Various rice diseases threaten the growth of rice. It is of great importance to achieve the rapid and accurate detection of rice diseases for precise disease prevention and control. Hyperspectral imaging (HSI) was performed to detect rice leaf diseases in four different varieties of rice. Considering that it costs much time and energy to develop a classifier for each variety of rice, deep transfer learning was first introduced to rice disease detection across different rice varieties. Three deep transfer learning methods were adapted for 12 transfer tasks, namely, fine-tuning, deep CORrelation ALignment (CORAL), and deep domain confusion (DDC). A self-designed convolutional neural network (CNN) was set as the base network of the deep transfer learning methods. Fine-tuning achieved the best transferable performance, with an accuracy of over 88% on the test set of the target domain in the majority of transfer tasks. Deep CORAL obtained an accuracy of over 80% in four of the transfer tasks, which was superior to that of DDC. A multi-task transfer strategy was also explored with good results, indicating the potential of both pair-wise and multi-task transfers. A saliency map was used to visualize the key wavelength range captured by the CNN with and without transfer learning. The results indicated that the wavelength ranges identified with and without transfer learning overlapped to some extent. Overall, the results suggest that deep transfer learning methods can perform rice disease detection across different rice varieties. Hyperspectral imaging, in combination with deep transfer learning, is a promising option for the efficient and cost-saving field detection of rice diseases among different rice varieties.
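For reference, the alignment term used by deep CORAL is the squared Frobenius distance between source and target feature covariances (Sun and Saenko, 2016); a direct PyTorch rendering is below. Adding it to the source-domain cross-entropy with a weighting coefficient is the usual training recipe, though the paper's exact network and weighting are not given in the abstract.

```python
import torch

def coral_loss(fs, ft):
    """Deep CORAL loss between source features fs and target features ft,
    each of shape [batch, d]."""
    d = fs.size(1)
    def cov(f):
        f = f - f.mean(dim=0, keepdim=True)
        return f.t() @ f / (f.size(0) - 1)
    return ((cov(fs) - cov(ft)) ** 2).sum() / (4 * d * d)

# total_loss = cross_entropy(source_logits, source_labels) + lam * coral_loss(fs, ft)
```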


2021 ◽  
Vol 5 (2) ◽  
pp. 81-91
Author(s):  
Elok Iedfitra Haksoro ◽  
Abas Setiawan

Not all mushrooms are edible, because some are poisonous. Edible and poisonous mushrooms can be distinguished by paying attention to morphological characteristics such as shape, color, and texture. There is an issue, however: some poisonous mushrooms have morphological features that are very similar to those of edible mushrooms, which can lead to misidentification. This work aims to recognize edible or poisonous mushrooms using a Deep Learning approach, typically Convolutional Neural Networks. Because the training process would take a long time, Transfer Learning was applied to accelerate the learning process. Transfer learning uses an existing model as a base model in our neural network by transferring information from a related domain. Four base models are used: MobileNets, MobileNetV2, ResNet50, and VGG19. Each base model is subjected to several experimental scenarios, such as setting different learning rate values for pre-training and fine-tuning. The results show that the Convolutional Neural Network with transfer learning can recognize edible or poisonous mushrooms with more than 86% accuracy. Moreover, the best accuracy, 92.19%, is obtained from the MobileNetV2 base model with a learning rate of 0.00001 at the pre-training stage and 0.0001 at the fine-tuning stage.
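The best-performing recipe maps onto the standard two-stage pattern, sketched here with torchvision's MobileNetV2; the optimiser and head replacement are assumptions, while the two learning rates are those reported above.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 2)  # edible/poisonous

# Stage 1, "pre-training": frozen base, train only the new head (lr = 0.00001).
for p in model.features.parameters():
    p.requires_grad = False
opt = torch.optim.Adam(model.classifier.parameters(), lr=1e-5)
# ... train for a few epochs ...

# Stage 2, fine-tuning: unfreeze everything at the larger rate (lr = 0.0001).
for p in model.parameters():
    p.requires_grad = True
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# ... continue training ...
```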


2020 ◽  
Vol 10 (10) ◽  
pp. 3359 ◽  
Author(s):  
Ibrahem Kandel ◽  
Mauro Castelli

Accurate classification of medical images is of great importance for correct disease diagnosis. The automation of medical image classification is of great necessity because it can provide a second opinion, or even a better classification, in case of a shortage of experienced medical staff. Convolutional neural networks (CNN) were introduced to improve the image classification domain by eliminating the need to manually select which features to use to classify images. Training a CNN from scratch requires very large annotated datasets, which are scarce in the medical field. Transferring CNN weights learned on another large non-medical dataset can help overcome the problem of medical image scarcity. Transfer learning consists of fine-tuning CNN layers to suit the new dataset. The main questions when using transfer learning are how deeply to fine-tune the network and what difference in generalization this will make. In this paper, all of the experiments were done on two histopathology datasets using three state-of-the-art architectures to systematically study the effect of block-wise fine-tuning of CNNs. Results show that fine-tuning the entire network is not always the best option, especially for shallow networks; fine-tuning only the top blocks can instead save both time and computational power while producing more robust classifiers.
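Block-wise fine-tuning reduces to choosing how many of the top blocks to unfreeze. A sketch using ResNet-50's four residual stages as the "blocks" (the paper's architectures and block boundaries may differ):

```python
from torchvision import models

def finetune_top_blocks(k):
    """Freeze everything, then unfreeze the classifier head plus the last
    k residual stages (k = 0 trains the head only; k = 4 trains everything
    but the stem)."""
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    for p in model.parameters():
        p.requires_grad = False
    stages = [model.layer1, model.layer2, model.layer3, model.layer4]
    for stage in stages[len(stages) - k:]:
        for p in stage.parameters():
            p.requires_grad = True
    for p in model.fc.parameters():
        p.requires_grad = True
    return model
```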


2021 ◽  
Vol 16 (1) ◽  
pp. 1-21
Author(s):  
Alejandro Moreo ◽  
Andrea Esuli ◽  
Fabrizio Sebastiani

Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning, the lack of labelled information from the target domain is compensated by the availability at training time of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time. Although this definition is indeed in line with Vapnik’s original definition of “transduction”, current terminology in the field is confused. In this article, we discuss how the term “transduction” has been misused in the transfer learning literature, and propose a clarification consistent with the original characterization of this term given by Vapnik. We go on to observe that the above terminology misuse has brought about misleading experimental comparisons, with inductive transfer learning methods that have been incorrectly compared with transductive transfer learning methods. We then give empirical evidence that the difference in performance between the inductive version and the transductive version of a transfer learning method can indeed be statistically significant (i.e., that knowing at training time the only data one needs to classify indeed gives an advantage). Our clarification allows a reassessment of the field, and of the relative merits of the major, state-of-the-art algorithms for transfer learning in text classification.


2021 ◽  
pp. 016555152199061
Author(s):  
Salima Lamsiyah ◽  
Abdelkader El Mahdaouy ◽  
Saïd El Alaoui Ouatik ◽  
Bernard Espinasse

Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order and the semantic relationships between words in a sentence, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC’2002–2004 datasets. The obtained results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning–based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves performance.
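The representation step can be sketched with the Hugging Face transformers library: embed each sentence with BERT (optionally a checkpoint already fine-tuned on GLUE tasks) and select the sentences closest to the document centroid. The mean pooling and centroid scoring below are illustrative baselines, not the paper's exact scoring method.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")  # or a GLUE-fine-tuned checkpoint

def embed(sentences):
    batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**batch).last_hidden_state        # [n, seq_len, 768]
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)         # mean-pooled embeddings

def top_k_summary(sentences, k=3):
    e = F.normalize(embed(sentences), dim=1)
    centroid = F.normalize(e.mean(0, keepdim=True), dim=1)
    scores = (e @ centroid.t()).squeeze(1)              # cosine to centroid
    idx = scores.topk(min(k, len(sentences))).indices.sort().values
    return [sentences[i] for i in idx.tolist()]         # keep document order
```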

