Deep Transfer Learning for Source Code Modeling

Author(s):  
Yasir Hussain ◽  
Zhiqiu Huang ◽  
Yu Zhou ◽  
Senzhang Wang

In recent years, deep learning models have shown great potential in source code modeling and analysis. Generally, deep learning-based approaches are problem-specific and data-hungry; a challenging issue is that they must be trained from scratch for each new, related problem. In this work, we propose a transfer learning-based approach that significantly improves the performance of deep learning-based source code models. In contrast to traditional learning paradigms, transfer learning can carry the knowledge learned in solving one problem over to another related problem. First, we present two recurrent neural network-based models (RNN and GRU) for the purpose of transfer learning in the domain of source code modeling. Next, via transfer learning, these pre-trained RNN and GRU models are used as feature extractors. The extracted features are then combined by an attention learner for different downstream tasks. The attention learner leverages the knowledge captured by the pre-trained models and fine-tunes it for a specific downstream task. We evaluate the proposed approach through extensive experiments on the source code suggestion task. The results indicate that it outperforms state-of-the-art models in terms of accuracy, precision, recall, and F-measure without training the models from scratch.
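As a rough illustration of the approach described above, the sketch below (our own simplification, not the authors' code) freezes pre-trained RNN and GRU encoders, concatenates their extracted features, and fine-tunes only a small attention learner and task head, e.g. for next-token source code suggestion; the layer sizes and the batch-first layout are assumptions.

```python
# Minimal sketch: frozen pre-trained encoders as feature extractors,
# with an attention learner fine-tuned for the downstream task.
import torch
import torch.nn as nn

class AttentionLearner(nn.Module):
    def __init__(self, pretrained_rnn, pretrained_gru, hidden_size, vocab_size):
        super().__init__()
        self.rnn = pretrained_rnn          # pre-trained nn.RNN, assumed batch_first=True
        self.gru = pretrained_gru          # pre-trained nn.GRU, assumed batch_first=True
        for p in list(self.rnn.parameters()) + list(self.gru.parameters()):
            p.requires_grad = False        # transfer learning: keep encoders frozen
        self.attn = nn.Linear(2 * hidden_size, 1)           # per-timestep attention score
        self.head = nn.Linear(2 * hidden_size, vocab_size)  # task-specific output layer

    def forward(self, embedded):                      # embedded: (batch, seq, emb_dim)
        rnn_feats, _ = self.rnn(embedded)             # (batch, seq, hidden)
        gru_feats, _ = self.gru(embedded)             # (batch, seq, hidden)
        feats = torch.cat([rnn_feats, gru_feats], dim=-1)   # combined extracted features
        weights = torch.softmax(self.attn(feats), dim=1)    # attend over timesteps
        context = (weights * feats).sum(dim=1)              # (batch, 2*hidden)
        return self.head(context)                     # logits, e.g. next-token scores
```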

Author(s):  
Niccolò Marastoni ◽  
Roberto Giacobazzi ◽  
Mila Dalla Preda

In the past few years, malware classification techniques have shifted from shallow traditional machine learning models to deeper neural network architectures. The main benefit of some of these is the ability to work with raw data, guaranteed by their automatic feature extraction capabilities. This reduces the technical expertise needed to build the models and, with it, the initial pre-processing effort. Nevertheless, this advantage comes with drawbacks, since deep learning models require huge quantities of data to produce a model that generalizes well, and the amount of data required to train a deep network without overfitting is often unobtainable for malware analysts. We take inspiration from image-based data augmentation techniques and apply a sequence of semantics-preserving syntactic code transformations (obfuscations) to a small dataset of programs to generate a larger dataset. We then design two learning models, a convolutional neural network and a bi-directional long short-term memory network, and train them on images extracted from compiled binaries of the newly generated dataset. Through transfer learning, we then take the features learned from the obfuscated binaries and train the models on two state-of-the-art malware datasets, each containing around 10,000 samples. Our models achieve up to 98.5% accuracy on the test set, which is on par with or better than present state-of-the-art approaches, thus validating the approach.
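A common way to obtain images from compiled binaries is to reinterpret the raw bytes as a grayscale bitmap; the sketch below illustrates that step under our own assumptions (fixed image width, hypothetical file path), not the authors' exact pipeline.

```python
# Minimal sketch: turn a compiled binary into a grayscale image so an
# image CNN can be trained or fine-tuned on it.
import math
import numpy as np
from PIL import Image

def binary_to_image(path, width=64):
    data = np.frombuffer(open(path, "rb").read(), dtype=np.uint8)
    height = math.ceil(len(data) / width)
    padded = np.zeros(width * height, dtype=np.uint8)
    padded[: len(data)] = data                          # pad the last row with zeros
    return Image.fromarray(padded.reshape(height, width), mode="L")

# Example usage on a hypothetical obfuscated variant from the augmented dataset:
# img = binary_to_image("samples/variant_obfuscated_0.bin")
# img.resize((64, 64)).save("images/variant_0.png")
```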


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning them on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
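The sketch below shows what a compact, grayscale-input CNN of this kind might look like in Keras; the layer sizes, input resolution, and three-class output are placeholders, not the published CovidNet configuration.

```python
# Minimal sketch: a small CNN that accepts single-channel (grayscale) images.
from tensorflow import keras
from tensorflow.keras import layers

def build_small_grayscale_cnn(num_classes=3, img_size=224):
    inputs = keras.Input(shape=(img_size, img_size, 1))   # one channel: grayscale
    x = layers.Conv2D(16, 3, activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)                # keeps parameter count small
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```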


2020 ◽  
Vol 12 (10) ◽  
pp. 1581 ◽  
Author(s):  
Daniel Perez ◽  
Kazi Islam ◽  
Victoria Hill ◽  
Richard Zimmerman ◽  
Blake Schaeffer ◽  
...  

Coastal ecosystems are critically affected by seagrass, both economically and ecologically. However, reliable seagrass distribution information is lacking in nearly all parts of the world because of the excessive costs associated with its assessment. In this paper, we develop two deep learning models for automatic seagrass distribution quantification based on 8-band satellite imagery. Specifically, we implement a deep capsule network (DCN) and a deep convolutional neural network (CNN) to assess seagrass distribution through regression. The DCN model first determines whether seagrass is present in the image through classification; if it is, the model then quantifies the seagrass through regression. During training, the regression and classification modules are jointly optimized to achieve end-to-end learning. The CNN model is trained strictly for regression on seagrass and non-seagrass patches. In addition, we propose a transfer learning approach that transfers knowledge from deep models trained at one location to perform seagrass quantification at a different location. We evaluate the proposed methods on three WorldView-2 satellite images taken from coastal areas in Florida. Experimental results show that the proposed DCN and CNN models performed similarly and achieved much better results than a linear regression model and a support vector machine. We also demonstrate that using transfer learning for seagrass quantification significantly improved the results compared to directly applying the deep models to new locations.
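To illustrate the joint classification-plus-regression idea, the following Keras sketch attaches a seagrass-presence head and a percent-cover head to a shared feature extractor over 8-band patches and optimizes both losses end to end; the patch size, layer widths, and loss weights are assumptions rather than the paper's DCN settings.

```python
# Minimal sketch: shared features with a classification head (presence)
# and a regression head (cover), jointly optimized.
from tensorflow import keras
from tensorflow.keras import layers

patch = keras.Input(shape=(5, 5, 8))                     # 8-band image patch
x = layers.Conv2D(32, 3, activation="relu", padding="same")(patch)
x = layers.Flatten()(x)
x = layers.Dense(64, activation="relu")(x)
presence = layers.Dense(1, activation="sigmoid", name="presence")(x)  # classification
cover = layers.Dense(1, activation="linear", name="cover")(x)         # regression
model = keras.Model(patch, [presence, cover])
model.compile(optimizer="adam",
              loss={"presence": "binary_crossentropy", "cover": "mse"},
              loss_weights={"presence": 1.0, "cover": 1.0})
```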


2020 ◽  
Vol 3 (2) ◽  
pp. 177-178
Author(s):  
John Jowil D. Orquia ◽  
El Jireh Bibangco

Manual fruit classification is the traditional way of classifying fruits. It is contact labor that is time-consuming and often results in lower productivity, inconsistency, and sometimes damage to the fruits (Prabha & Kumar, 2012). New technologies such as deep learning have thus paved the way for a faster and more efficient method of fruit classification (Faridi & Aboonajmi, 2017). A deep convolutional neural network is a deep learning model that stacks several layers of neural networks to create a more complex model capable of solving complex problems. State-of-the-art pre-trained deep learning models such as AlexNet, GoogLeNet, and ResNet-50 have been widely used; however, such models were not explicitly trained for fruit classification (Dyrmann, Karstoft, & Midtiby, 2016). The study aimed to create a new deep convolutional neural network and compare its performance to fine-tuned pre-trained models based on accuracy, precision, sensitivity, and specificity.
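A typical fine-tuning baseline of the kind compared against in such studies is sketched below: an ImageNet-pre-trained ResNet-50 with a new classification head; the input size and the ten-class output are hypothetical.

```python
# Minimal sketch: fine-tuning a pre-trained ResNet-50 for fruit classification.
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                                   # freeze ImageNet features first
x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(10, activation="softmax")(x)      # hypothetical 10 fruit classes
model = keras.Model(base.input, outputs)
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```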


2019 ◽  
Vol 9 (10) ◽  
pp. 2138 ◽  
Author(s):  
Cong Pan ◽  
Minyan Lu ◽  
Biao Xu ◽  
Houleng Gao

To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, researchers have introduced deep learning models, such as the deep belief network (DBN) and the convolutional neural network (CNN), and used features automatically extracted from abstract syntax trees (ASTs) to improve defect prediction performance. However, the research on the CNN model failed to reach clear conclusions due to its limited dataset size, insufficiently repeated experiments, and outdated baseline selection. To address these problems, we built the PROMISE Source Code (PSC) dataset to enlarge the original dataset used in the CNN research, which we refer to as the Simplified PROMISE Source Code (SPSC) dataset. We then proposed an improved CNN model for within-project defect prediction (WPDP) and compared our results to existing CNN results and an empirical study. Our experiments were based on a 30-repetition holdout validation and a 10 × 10 cross-validation. Experimental results showed that our improved CNN model was comparable to the existing CNN model and significantly outperformed state-of-the-art machine learning models for WPDP. Furthermore, we defined hyperparameter instability and examined the threat and opportunity it presents for deep learning models in defect prediction.
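The sketch below outlines one plausible form of a CNN over AST token sequences for defect prediction, with an embedding layer, a 1-D convolution, and a binary output; the vocabulary size, sequence length, and filter settings are assumptions, not the paper's configuration.

```python
# Minimal sketch: CNN over AST token sequences for defect prediction.
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, seq_len = 5000, 2000                       # AST node vocabulary, padded length
inputs = keras.Input(shape=(seq_len,))
x = layers.Embedding(vocab_size, 30)(inputs)           # learned token embeddings
x = layers.Conv1D(10, 5, activation="relu")(x)         # convolve over token windows
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(100, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)     # defect-prone or not
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```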


2022 ◽  
Vol 2161 (1) ◽  
pp. 012078
Author(s):  
Pallavi R Mane ◽  
Rajat Shenoy ◽  
Ghanashyama Prabhu

COVID-19 is a deadly, dangerous, and contagious disease caused by the novel coronavirus. It is very important to detect COVID-19 infection accurately and as quickly as possible to limit its spread. Deep learning methods can significantly improve the efficiency and accuracy of reading chest X-rays (CXRs), and existing deep learning models, with further fine-tuning, provide cost-effective, rapid, and better classification results. This paper deploys well-studied AI tools, with modifications, on X-ray images to classify COVID-19. The research performs five experiments to classify COVID-19 CXRs against Normal and Viral Pneumonia CXRs using convolutional neural networks (CNNs). Four experiments were performed on state-of-the-art pre-trained models using transfer learning, and one experiment used a CNN designed from scratch. The dataset consists of chest X-ray images from the Kaggle dataset and other publicly accessible sources. The data was split into three parts: 90% was retained for training the models, and 5% each was used for validation and testing. The four transfer learning models were Inception, Xception, ResNet, and VGG19, which achieved test accuracies of 93.07%, 94.8%, 67.5%, and 91.1%, respectively, while our CNN model achieved 94.6%.
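The 90/5/5 split described above can be reproduced, for example, with two stratified calls to scikit-learn's train_test_split; the file names and labels in the sketch below are placeholders, not the actual dataset.

```python
# Minimal sketch: 90% train, 5% validation, 5% test split.
from sklearn.model_selection import train_test_split

image_paths = [f"cxr_{i:04d}.png" for i in range(999)]        # placeholder file names
labels = ["covid", "normal", "viral_pneumonia"] * 333          # placeholder labels

# 90% train, 10% hold-out; then split the hold-out evenly into validation/test.
x_train, x_hold, y_train, y_hold = train_test_split(
    image_paths, labels, test_size=0.10, stratify=labels, random_state=42)
x_val, x_test, y_val, y_test = train_test_split(
    x_hold, y_hold, test_size=0.50, stratify=y_hold, random_state=42)
```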


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over non-augmented and conventionally SMILES-randomization-augmented data when used for training the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain, an enhancement in the pattern recognition capabilities of the underlying network for molecular motifs.
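One way to realize the sub-sequence-similarity idea is to score candidate reactant/product SMILES pairs by Levenshtein edit distance and keep the closest pair; the sketch below is our own reading of that idea, not the authors' implementation.

```python
# Minimal sketch: pair randomized reactant/product SMILES by edit distance.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def best_pair(reactant_variants, product_variants):
    """Pick the randomized SMILES pair with the smallest edit distance."""
    return min(((r, p) for r in reactant_variants for p in product_variants),
               key=lambda pair: levenshtein(*pair))
```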


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that current trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.


2020 ◽  
Author(s):  
Pathikkumar Patel ◽  
Bhargav Lad ◽  
Jinan Fiaidhi

During the last few years, RNN models have been used extensively and have proven well suited to sequence and text data. RNNs have achieved state-of-the-art performance in several applications such as text classification, sequence-to-sequence modelling, and time series forecasting. In this article we review different machine learning and deep learning-based approaches for text data and examine the results obtained with these methods. This work also explores the use of transfer learning in NLP and how it affects model performance on a specific application, sentiment analysis.

