Is One Teacher Model Enough to Transfer Knowledge to a Student Model?

Algorithms ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 334
Author(s):  
Nicola Landro ◽  
Ignazio Gallo ◽  
Riccardo La Grassa

Nowadays, transfer learning is successfully applied in deep learning by pre-training a CNN on a huge dataset such as ImageNet and then fine-tuning it on a target dataset to achieve better performance. In this paper, we design a transfer learning methodology that combines the learned features of multiple teachers into a student network in an end-to-end model, improving the student network's performance on classification tasks across different datasets. We also address two questions directly related to this transfer learning problem: Is it possible to improve the performance of a small neural network by using the knowledge gained from a more powerful neural network? And can a student network outperform its teachers through transfer learning? Experimental results suggest that neural networks can transfer their learning to student networks through the proposed architecture, which highlights a new and interesting approach to transfer learning. Finally, we provide details of the code and the experimental settings.
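A minimal sketch of the multi-teacher distillation idea described above, in PyTorch: the loss combines hard-label cross-entropy with a KL term toward the averaged teacher soft targets. The temperature T and the weighting alpha are illustrative assumptions, not the authors' settings.

```python
# Hedged sketch of multi-teacher knowledge distillation (not the paper's exact model).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, labels, T=4.0, alpha=0.5):
    """Cross-entropy on hard labels plus KL toward averaged teacher soft targets."""
    # Average the teachers' softened distributions (teachers run under no_grad, eval mode).
    soft_targets = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]
    ).mean(dim=0)
    hard_loss = F.cross_entropy(student_logits, labels)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1), soft_targets,
        reduction="batchmean") * (T * T)          # T^2 keeps gradient scale comparable
    return alpha * hard_loss + (1 - alpha) * soft_loss
```

In practice the teacher logits would be computed once per batch with `torch.no_grad()` so that only the student receives gradient updates.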

Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 496
Author(s):  
Hyunjae Lee ◽  
Eun Young Seo ◽  
Hyosang Ju ◽  
Sang-Hyo Kim

Neural network decoders (NNDs) for rate-compatible polar codes are studied in this paper. We consider a family of rate-compatible polar codes constructed from a single polar coding sequence as defined by 5G New Radio. We propose a transfer learning technique for training multiple NNDs of the rate-compatible polar codes that exploits their inclusion property: the NND trained for a low-rate code is taken as the initial state for training the NND of the code with the next rate. Numerical results show that the proposed method trains more quickly than learning each NND separately. We additionally show that transfer learning can resolve an underfitting problem in NND training caused by low model complexity.
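A hedged sketch of the training schedule the abstract describes: the NND trained for one rate warm-starts training for the next rate in the family. The dense decoder architecture, layer sizes, and fixed input width are assumptions for illustration.

```python
# Sketch: sequential warm-starting of NNDs across a rate-compatible code family.
import copy
import torch.nn as nn

def make_decoder(n_llrs, n_info_bits):
    # A simple dense NND over channel LLRs; the paper's exact model may differ.
    return nn.Sequential(
        nn.Linear(n_llrs, 256), nn.ReLU(),
        nn.Linear(256, 128), nn.ReLU(),
        nn.Linear(128, n_info_bits), nn.Sigmoid())

decoders, prev_state = {}, None
for rate_id in ["R1", "R2", "R3"]:               # lowest rate first (assumed order)
    decoder = make_decoder(n_llrs=64, n_info_bits=32)
    if prev_state is not None:
        decoder.load_state_dict(prev_state)      # warm start from the previous rate
    # ... train `decoder` on noisy codewords of this rate here ...
    prev_state = copy.deepcopy(decoder.state_dict())
    decoders[rate_id] = decoder
```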


Author(s):  
Ali Fakhry

Applications of Deep Q-Networks (DQNs) appear throughout reinforcement learning, a large subfield of machine learning. Using a classic environment from OpenAI Gym, CarRacing-v0, a 2D car-racing environment, alongside a custom modification of that environment, a DQN was created to solve both the classic and custom environments. The environments are tested using custom CNN architectures and by applying transfer learning from ResNet18. While DQNs were state of the art years ago, using one for CarRacing-v0 appears somewhat unappealing and less effective than other reinforcement learning techniques. Overall, while the model did train and the agent learned various parts of the environment, reaching the environment's reward threshold with this reinforcement learning technique proved problematic and difficult, and other techniques would be more useful.
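One plausible way to combine a DQN with transfer learning from ResNet18, as the abstract describes, is to freeze the pretrained backbone and learn Q-values on top of its features. The discretized action count and head sizes below are assumptions, not the author's configuration.

```python
# Sketch: Q-network head on frozen ResNet18 features for a discretized action set.
import torch.nn as nn
from torchvision import models

class ResNetDQN(nn.Module):
    def __init__(self, n_actions=5):                 # assumed discretization of CarRacing controls
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the FC layer
        for p in self.features.parameters():         # freeze the pretrained backbone
            p.requires_grad = False
        self.q_head = nn.Sequential(
            nn.Flatten(), nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, n_actions))               # one Q-value per discrete action

    def forward(self, frames):                       # frames: [B, 3, H, W]
        return self.q_head(self.features(frames))
```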


2018 ◽  
Vol 5 (2) ◽  
pp. 145-156 ◽  
Author(s):  
Taposh Kumar Neogy ◽  
Naresh Babu Bynagari

In machine learning, the transition from hand-designed features to learned features has been a huge success. Nevertheless, optimization algorithms are still designed by hand. In this study, we show how the design of an optimization algorithm can be recast as a learning problem, allowing the algorithm to learn automatically to exploit structure in the problems of interest. Our learned algorithms, implemented by LSTMs, outperform generic, hand-designed competitors on the tasks for which they are trained, and they also generalize well to new tasks with similar structure. We demonstrate this on a variety of tasks, including simple convex problems, training neural networks, and styling images with neural art.
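The idea of replacing a hand-designed update rule with a learned one can be sketched as a coordinatewise LSTM that maps gradients to parameter updates. The hidden size, output scaling, and toy objective below are illustrative assumptions, not the study's setup.

```python
# Sketch: a learned optimizer where the update rule is an LSTM over gradients.
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    def __init__(self, hidden=20):
        super().__init__()
        self.lstm = nn.LSTMCell(1, hidden)   # coordinatewise: one scalar gradient per step
        self.out = nn.Linear(hidden, 1)

    def step(self, grads, state):
        # grads: [n_params, 1]; each coordinate is treated as an independent sequence.
        h, c = self.lstm(grads, state)
        return 0.1 * self.out(h), (h, c)     # learned update (assumed output scaling)

# One unrolled optimization step on a toy quadratic f(theta) = ||theta||^2:
opt = LSTMOptimizer()
theta = torch.randn(8, 1, requires_grad=True)
state = (torch.zeros(8, 20), torch.zeros(8, 20))
loss = (theta ** 2).sum()
grad, = torch.autograd.grad(loss, theta)
update, state = opt.step(grad, state)
theta = theta + update   # meta-training would backprop through unrolled steps like this
```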


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jin-Woong Lee ◽  
Woon Bae Park ◽  
Jin Hee Lee ◽  
Satendra Pal Singh ◽  
Kee-Sun Sohn

Here we report a facile, prompt protocol based on deep-learning techniques to sort out intricate phase-identification and quantification problems in complex multiphase inorganic compounds. We simulate plausible powder X-ray diffraction (XRD) patterns for 170 inorganic compounds in the Sr-Li-Al-O quaternary compositional pool, wherein promising LED phosphors have recently been discovered. In total, 1,785,405 synthetic XRD patterns are prepared by combinatorially mixing the simulated powder XRD patterns of the 170 compounds. Convolutional neural network (CNN) models are built and trained on this large dataset. The fully trained CNN model promptly and accurately identifies the constituent phases in complex multiphase inorganic compounds. Although the CNN is trained on simulated XRD data, a test with real experimental XRD data returns an accuracy of nearly 100% for phase identification and 86% for three-step phase-fraction quantification.
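A rough sketch of how such a model might look: a small 1D CNN that maps an XRD intensity pattern to a per-phase presence probability. The pattern length, layer sizes, and multi-label framing are assumptions, not the authors' architecture.

```python
# Sketch: 1D CNN for multiphase XRD identification (presence probability per phase).
import torch.nn as nn

n_phases, pattern_len = 170, 4501   # assumed: 2-theta range sampled at fixed steps

xrd_cnn = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=9, stride=2), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=9, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, n_phases), nn.Sigmoid())   # independent presence score per candidate phase
# Input: [batch, 1, pattern_len] intensities; output: [batch, n_phases] in [0, 1].
```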


Processes ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 2029
Author(s):  
Yan-Kai Chen ◽  
Steven Shave ◽  
Manfred Auer

Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of high-quality, abundant training data can be a major limiting factor in building effective property predictors with machine learning. We utilize transfer learning techniques to get around this problem, first training on a large number of low-accuracy predicted logP values and then fine-tuning the model on a small, accurate dataset of 244 druglike compounds. The result is MRlogP, a neural-network-based logP predictor that outperforms state-of-the-art freely available logP prediction methods for druglike small molecules. MRlogP achieves average root mean squared errors of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP, respectively. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface.
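The two-stage schedule described above can be sketched as pretraining on the abundant, low-accuracy predicted values and then fine-tuning on the small accurate set at a lower learning rate. The descriptor width, learning rates, and epoch counts are assumptions for illustration.

```python
# Sketch: pretrain on noisy predicted logP, then fine-tune on the small accurate set.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(200, 128), nn.ReLU(),
                      nn.Linear(128, 1))     # molecular descriptors -> logP (assumed widths)
loss_fn = nn.MSELoss()

def fit(model, loader, lr, epochs):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Stage 1: large, low-accuracy predicted logP values (hypothetical loader).
# fit(model, big_noisy_loader, lr=1e-3, epochs=20)
# Stage 2: the small accurate set of 244 druglike compounds, lower learning rate.
# fit(model, small_accurate_loader, lr=1e-4, epochs=50)
```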


Author(s):  
Justin S Smith ◽  
Benjamin T. Nebgen ◽  
Roman Zubatyuk ◽  
Nicholas Lubbers ◽  
Christian Devereux ◽  
...  

Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist's toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields are cheap and scalable but lack transferability to new systems. Machine learning can be used to achieve the best of both approaches. Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network on DFT data and then using transfer learning techniques to retrain it on a dataset of gold-standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology, and chemistry, and is billions of times faster than CCSD(T)/CBS calculations.
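The DFT-to-CCSD(T)/CBS transfer step might look roughly like the following: keep the representation learned from plentiful DFT data and retrain only the final layer on the scarce, higher-accuracy labels. The per-atom network shape and the choice of which layers to freeze are assumptions, not the ANI-1ccx recipe.

```python
# Sketch: freeze DFT-pretrained layers, retrain the head on CCSD(T)/CBS-level labels.
import torch.nn as nn

atomic_net = nn.Sequential(
    nn.Linear(384, 160), nn.CELU(),   # atomic environment features (assumed width)
    nn.Linear(160, 128), nn.CELU(),
    nn.Linear(128, 1))                # per-atom energy contribution

# ... pretrain atomic_net on abundant DFT energies here ...
for layer in list(atomic_net.children())[:-1]:
    for p in layer.parameters():
        p.requires_grad = False       # freeze the DFT-learned representation
# ... retrain the final layer on the small CCSD(T)/CBS dataset here ...
```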


2021 ◽  
Author(s):  
Ghassan Mohammed Halawani

The main purpose of this project is to modify a convolutional neural network for image classification, based on a deep-learning framework. A transfer learning technique is applied through the MATLAB interface to AlexNet: the parameters of the last two fully connected layers of AlexNet are retrained on a new dataset to classify thousands of images. First, the general architecture common to most neural networks and its benefits are presented. The mathematical models and the role of each part of the neural network are explained in detail. Second, different neural networks are studied in terms of architecture, application, and working method to highlight the strengths and weaknesses of each. The final part is a detailed study of one of the most powerful deep-learning networks for image classification, the convolutional neural network, and of how it can be adapted to different classification tasks using the transfer learning technique in MATLAB.
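The project itself uses MATLAB's interface to AlexNet; a rough PyTorch analogue of retraining the last two fully connected layers on a new dataset is sketched below, with the class count as a hypothetical.

```python
# Sketch (PyTorch analogue of the MATLAB workflow): replace and retrain
# only the last two fully connected layers of a pretrained AlexNet.
import torch.nn as nn
from torchvision import models

net = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
for p in net.parameters():
    p.requires_grad = False                      # keep all pretrained weights fixed

n_classes = 10                                   # hypothetical new dataset
net.classifier[4] = nn.Linear(4096, 4096)        # second-to-last FC layer, reinitialized
net.classifier[6] = nn.Linear(4096, n_classes)   # final FC layer, sized for the new classes
# Only the two replaced layers have requires_grad=True and receive gradient updates.
```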


Author(s):  
Jaisakthi Seetharani Murugaiyan ◽  
Mirunalini Palaniappan ◽  
Thenmozhi Durairaj ◽  
Vigneshkumar Muthukumar

Marine species recognition is the process of identifying various species, which helps in population estimation and in finding endangered species so that further remedies and actions can be taken. The superior performance of deep learning for classification comes from estimating millions of parameters, which requires large annotated datasets. However, many fish species are becoming extinct, which reduces the number of available samples. The unavailability of a large dataset is a significant hurdle for applying deep neural networks, and it can be overcome using transfer learning. We therefore propose a transfer learning technique that takes underwater fish images as input and detects the fish species using a pre-trained Google Inception-v3 model. We evaluated the proposed method on the Fish4Knowledge (F4K) dataset and obtained an accuracy of 95.37%. The research would help marine biologists identify fish existence and quantity, understand the underwater environment, encourage its preservation, and study the behavior and interactions of marine animals.
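A minimal sketch of the Inception-v3 transfer setup, written in PyTorch for consistency with the other examples; treating F4K as a 23-class problem is an assumption about the authors' split.

```python
# Sketch: freeze a pretrained Inception-v3 and retrain new classification heads.
import torch.nn as nn
from torchvision import models

inception = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
for p in inception.parameters():
    p.requires_grad = False                      # freeze the pretrained backbone

n_species = 23                                   # assumed number of F4K species classes
inception.fc = nn.Linear(2048, n_species)        # new main classification head
inception.AuxLogits.fc = nn.Linear(768, n_species)  # auxiliary head, used only in training
# At inference time (model.eval()), only the main head's output is used.
```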


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1879
Author(s):  
Zahid Ali Siddiqui ◽  
Unsang Park

In this paper, we present a novel incremental learning technique to solve the catastrophic forgetting problem observed in CNN architectures. We use a progressive deep neural network to incrementally learn new classes while keeping the network's performance on old classes unchanged. The incremental training requires training the network only on the new classes and fine-tuning the final fully connected layer, without retraining the entire network, which significantly reduces training time. We evaluate the proposed architecture extensively on image classification tasks using the Fashion MNIST, CIFAR-100, and ImageNet-1000 datasets. Experimental results show that the proposed network architecture not only alleviates catastrophic forgetting but also leverages prior knowledge of previously learned classes and their features via lateral connections. In addition, the proposed scheme is easily scalable and does not require structural changes to the network trained on the old task, both highly desirable properties in embedded systems.
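A toy fragment of the progressive-network idea: a new column for the new classes receives lateral connections from the frozen old column, so prior features are reused without retraining them. The layer widths and the old column's interface are assumptions, not the proposed architecture.

```python
# Sketch: new column with a lateral connection from a frozen old column.
import torch.nn as nn

class NewColumn(nn.Module):
    def __init__(self, old_column, n_new_classes):
        super().__init__()
        self.old = old_column                  # assumed to expose a hidden layer `h1`
        for p in self.old.parameters():
            p.requires_grad = False            # old task stays untouched (no forgetting)
        self.h1 = nn.Linear(784, 128)
        self.lateral = nn.Linear(128, 128)     # taps the old column's features
        self.out = nn.Linear(128, n_new_classes)

    def forward(self, x):                      # x: [B, 784] flattened input
        old_feat = self.old.h1(x).relu()       # frozen, reused knowledge
        new_feat = (self.h1(x) + self.lateral(old_feat)).relu()
        return self.out(new_feat)              # logits for the new classes only
```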


2021 ◽  
Vol 9 (2) ◽  
pp. 211
Author(s):  
Faisal Dharma Adhinata ◽  
Gita Fadila Fitriana ◽  
Aditya Wijayanto ◽  
Muhammad Pajar Kharisma Putra

Indonesia is an agricultural country with abundant agricultural products. One of the crops used as a staple food in Indonesia is corn. Corn plants must be protected from disease so that harvest quality can be optimal. Early detection of disease in corn plants is needed so that farmers can provide treatment quickly and precisely. Previous research applied machine learning techniques to this problem, but the results were not optimal because the data used were few and not varied. We therefore propose a technique that can process large and varied data, in the hope that the resulting system is more accurate than previous research. This research uses a transfer learning technique for feature extraction combined with a convolutional neural network for classification. We analyse the combination of DenseNet201 with either a Flatten or a Global Average Pooling layer. The experimental results show that DenseNet201 with the Global Average Pooling layer is more accurate than DenseNet201 with the Flatten layer, reaching an accuracy of 93%, which shows the proposed system is more accurate than previous studies.
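The comparison the abstract describes can be sketched as two heads on the same DenseNet201 feature extractor, one using Global Average Pooling and one using Flatten. The 224x224 input and four-class output are assumptions, not the authors' configuration.

```python
# Sketch: DenseNet201 features with a GAP head versus a Flatten head.
import torch.nn as nn
from torchvision import models

backbone = models.densenet201(weights=models.DenseNet201_Weights.DEFAULT).features
n_classes = 4   # assumed: corn disease classes plus healthy

# Variant 1: Global Average Pooling -> 1920 features regardless of input size.
gap_head = nn.Sequential(backbone, nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(1920, n_classes))

# Variant 2: Flatten -> 1920 * 7 * 7 features for a 224x224 input (far more parameters).
flatten_head = nn.Sequential(backbone, nn.ReLU(), nn.Flatten(),
                             nn.Linear(1920 * 7 * 7, n_classes))
```

The GAP variant has a much smaller classification head, which is one common explanation for why it generalizes better on modest datasets.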

