Data Augmentation and Transfer Learning Strategies for Reaction Prediction in Low Chemical Data Regimes

10.26434/chemrxiv.13383275 ◽

2020 ◽

Author(s):

Yun Zhang ◽

Ling Wang ◽

Xinqiao Wang ◽

Chengyun Zhang ◽

Jiamin Ge ◽

...

Keyword(s):

Deep Learning ◽

Learning Strategies ◽

Transfer Learning ◽

Data Augmentation ◽

Learning Strategy ◽

Predictive Performance ◽

Learning Model ◽

Training Data ◽

Reaction Prediction ◽

Transformer Model

Abstract: Effective and rapid deep learning methods for predicting chemical reactions contribute to research and development in organic chemistry and drug discovery. Despite the outstanding capability of deep learning in retrosynthesis and forward synthesis, predictions based on small chemical datasets generally have low accuracy because too few reaction examples are available. Here, we introduce a new state-of-the-art method that integrates transfer learning with the transformer model to predict the outcomes of the Baeyer-Villiger reaction, a representative reaction with a small dataset. The results demonstrate that the transfer learning strategy markedly improves top-1 accuracy, from 58.4% for the transformer-baseline model to 81.8% for the transformer-transfer learning model. Moreover, applying data augmentation to the input reaction SMILES further improves the accuracy of the transformer-transfer learning model to 86.7%. In summary, both transfer learning and data augmentation significantly improve the predictive performance of the transformer model and are powerful methods for overcoming the restriction of limited training data in chemistry.
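The top-1 accuracy quoted in this abstract can be illustrated with a minimal sketch. It assumes each model output is a ranked list of product SMILES strings and that predictions and ground truth are already canonicalized, so that plain string equality is a fair comparison. The function name `top_k_accuracy` and the toy strings are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of reactions whose true product is among the first k ranked predictions."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one reaction is required")
    hits = sum(1 for ranked, truth in zip(predictions, targets) if truth in ranked[:k])
    return hits / len(targets)


# Toy ranked predictions (hypothetical; assumed to be canonical SMILES already).
ranked = [
    ["CC(=O)OC", "COC(C)=O"],
    ["O=C1CCCCO1", "O=C1CCCCCO1"],   # correct product is ranked second
    ["CC(=O)Oc1ccccc1", "CC(=O)c1ccccc1O"],
]
truth = ["CC(=O)OC", "O=C1CCCCCO1", "CC(=O)Oc1ccccc1"]

top1 = top_k_accuracy(ranked, truth, k=1)  # two of three reactions are hit at rank 1
top2 = top_k_accuracy(ranked, truth, k=2)  # all three are hit within the top two
```

In practice the comparison is only meaningful after both sides pass through the same SMILES canonicalization step, since one molecule can be written as many different strings.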


Olympic Games Event Recognition via Transfer Learning with Photobombing Guided Data Augmentation

Journal of Imaging ◽

10.3390/jimaging7020012 ◽

2021 ◽

Vol 7 (2) ◽

pp. 12

Author(s):

Yousef I. Mohamad ◽

Samah S. Baraheem ◽

Tam V. Nguyen

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Data Augmentation ◽

Olympic Games ◽

Event Recognition ◽

Surveillance Systems ◽

Video Captioning ◽

Practical Applications ◽

Sport Events ◽

The Olympic Games

Automatic event recognition in sports photos is an interesting and valuable research topic in computer vision and deep learning. With the rapid growth and spread of continuously captured data, fast and precise access to the right information has become a challenging task of considerable importance for many practical applications, e.g., sports image and video search, sports data analysis, healthcare monitoring applications, monitoring and surveillance systems for indoor and outdoor activities, and video captioning. In this paper, we evaluate different deep learning models for recognizing and interpreting the sport events in the Olympic Games. To this end, we collect a dataset dubbed the Olympic Games Event Image Dataset (OGED), which covers 10 different sport events scheduled for the Olympic Games Tokyo 2020. Transfer learning is then applied to three popular deep convolutional neural network architectures, namely AlexNet, VGG-16, and ResNet-50, along with various data augmentation methods. Extensive experiments show that ResNet-50 with the proposed photobombing guided data augmentation achieves an accuracy of 90%.


A deep learning method for extensible microstructural quantification of DP steel enhanced by physical metallurgy-guided data augmentation

Materials Characterization ◽

10.1016/j.matchar.2021.111392 ◽

2021 ◽

pp. 111392

Author(s):

Chunguang Shen ◽

Xiaolu Wei ◽

Chenchong Wang ◽

Wei Xu

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Physical Metallurgy ◽

Learning Method ◽

DP Steel


Comparative Analysis on Deep Learning Approaches for Heavy-Vehicle Detection based on Data Augmentation and Transfer-Learning techniques

Journal of Scientific Research ◽

10.3329/jsr.v13i3.52332 ◽

2021 ◽

Vol 13 (3) ◽

pp. 809-820

Author(s):

V. Sowmya ◽

R. Radha

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Real Time ◽

Transfer Learning ◽

Convolutional Neural Networks ◽

Traffic Management ◽

Data Augmentation ◽

Vehicle Detection ◽

Heavy Vehicles ◽

Detection And Recognition

Vehicle detection and recognition in a real-time traffic surveillance system demand advanced computational intelligence and resources for effective traffic management under all possible contingencies. One focus area of deep intelligent systems is to provide vehicle detection and recognition techniques for robust traffic management of heavy vehicles. Such techniques include Support Vector Machines (SVM), Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (R-CNN), and the You Only Look Once (YOLO) model. Accordingly, it is pivotal to choose an appropriate algorithm for vehicle detection and recognition that also suits the real-time environment. In this study, the deep learning algorithms Faster R-CNN, YOLOv2, YOLOv3, and YOLOv4 are compared on diverse aspects of their features. Two classes of heavy transport vehicles, buses and trucks, are the detection and recognition targets in this work. Data augmentation and transfer learning are used to build, train, and test the models, both to avoid over-fitting and to improve speed and accuracy. Extensive empirical evaluation is conducted on two standard datasets, COCO and PASCAL VOC 2007. Finally, comparative results and analyses are presented for real-time operation.


An Empirical Study of Deep Learning Frameworks for Melanoma Cancer Detection using Transfer Learning and Data Augmentation

10.1109/ickg52313.2021.00015 ◽

2021 ◽

Author(s):

Divya Gangwani ◽

Qianxin Liang ◽

Shuwen Wang ◽

Xingquan Zhu

Keyword(s):

Deep Learning ◽

Empirical Study ◽

Transfer Learning ◽

Cancer Detection ◽

Data Augmentation ◽

Melanoma Cancer ◽

Learning Frameworks


Oversampling Based on Data Augmentation in Convolutional Neural Network for Silicon Wafer Defect Classification

Knowledge Innovation Through Intelligent Software Methodologies, Tools and Techniques - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200547 ◽

2020 ◽

Author(s):

Uzma Batool ◽

Mohd Ibrahim Shapiai ◽

Nordinah Ismail ◽

Hilman Fauzi ◽

Syahrizal Salleh

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Silicon Wafer ◽

Data Augmentation ◽

Imbalanced Data ◽

Training Data ◽

Defect Classification ◽

Learning Method ◽

Test Set

Silicon wafer defect data collected from fabrication facilities is intrinsically imbalanced because defect types occur with different frequencies. If a model is trained on such skewed data, frequently occurring types have more influence on its classification predictions. A fair classifier for such data therefore requires a mechanism that handles class imbalance in order to avoid biased results. This study proposes a convolutional neural network for wafer map defect classification that uses oversampling to address the imbalance. To give all classes equal participation in the classifier's training, data augmentation is employed to generate more samples for the minority classes. The proposed deep learning method was evaluated on a real wafer map defect dataset, and its classification results on the test set reached 97.91% accuracy. The results were compared with those of another deep learning-based auto-encoder model, indicating that the proposed method is a potential approach for silicon wafer defect classification whose robustness needs further investigation.
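The idea of oversampling minority classes through augmentation can be sketched as follows. This is a minimal illustration under stated assumptions: wafer maps are modeled as small 2D integer grids, and the augmentations are 90-degree rotations and horizontal flips. The abstract does not say which transformations the authors used, so the operations and every name here (`rotate90`, `flip_horizontal`, `oversample`) are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Grid = List[List[int]]


def rotate90(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def augment(grid: Grid, rng: random.Random) -> Grid:
    """Apply a random number of rotations and an optional flip (assumed augmentations)."""
    out = [row[:] for row in grid]
    for _ in range(rng.randrange(4)):
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out


def oversample(samples: List[Tuple[Grid, str]], seed: int = 0) -> List[Tuple[Grid, str]]:
    """Add augmented copies until every class matches the largest class in size."""
    if not samples:
        return []
    rng = random.Random(seed)
    by_class: Dict[str, List[Grid]] = {}
    for grid, label in samples:
        by_class.setdefault(label, []).append(grid)
    target = max(len(grids) for grids in by_class.values())
    balanced = list(samples)
    for label, grids in by_class.items():
        for _ in range(target - len(grids)):
            balanced.append((augment(rng.choice(grids), rng), label))
    return balanced


# Toy data: three "edge" maps and one "scratch" map (labels are illustrative).
data = [
    ([[1, 0], [0, 0]], "edge"),
    ([[0, 1], [0, 0]], "edge"),
    ([[0, 0], [1, 0]], "edge"),
    ([[1, 1], [0, 0]], "scratch"),
]
balanced = oversample(data)
counts = Counter(label for _, label in balanced)
```

Because the random transform can be the identity, some generated samples may be exact copies of an original; a stricter variant would reject those or widen the set of transformations.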


A Joint Back-Translation and Transfer Learning Method for Low-Resource Neural Machine Translation

Mathematical Problems in Engineering ◽

10.1155/2020/6140153 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Gong-Xu Luo ◽

Ya-Ting Yang ◽

Rui Dong ◽

Yan-Hong Chen ◽

Wen-Bo Zhang

Keyword(s):

Machine Translation ◽

Transfer Learning ◽

Large Scale ◽

Data Augmentation ◽

Training Methods ◽

Learning Method ◽

Neural Machine Translation ◽

Low Resource ◽

Parallel Data ◽

Back Translation

Neural machine translation (NMT) for low-resource languages has drawn great attention in recent years. In this paper, we propose a joint back-translation and transfer learning method for low-resource languages. It is widely recognized that data augmentation and transfer learning are both straightforward and effective approaches to low-resource problems. However, existing methods use only one of the two, which limits the capacity of NMT models in low-resource settings. To exploit the advantages of both and further improve translation performance for low-resource languages, we propose a new method that integrates back-translation with mainstream transfer learning architectures. The method not only initializes the NMT model by transferring the parameters of pretrained models, but also generates synthetic parallel data by translating large-scale target-side monolingual data, which boosts the fluency of translations. We conduct experiments to explore the effectiveness of the joint method by incorporating back-translation into the parent-child and the hierarchical transfer learning architectures. In addition, different preprocessing and training methods are explored to obtain better performance. Experimental results on Uygur-Chinese and Turkish-English translation demonstrate the superiority of the proposed method over baselines that use either method alone.
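The back-translation step described in this abstract can be sketched as below. The sketch assumes a target-to-source translation function is available; `toy_reverse_model` is a stand-in that merely reverses word order so the example runs without a trained model. All names are hypothetical, and the sketch covers only the synthetic-data generation, not the parameter transfer from a pretrained parent model.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(
    target_monolingual: Iterable[str],
    target_to_source: Callable[[str], str],
) -> List[Pair]:
    """Pair each genuine target sentence with a machine-generated source sentence."""
    synthetic: List[Pair] = []
    for tgt in target_monolingual:
        src = target_to_source(tgt)
        if src.strip():  # skip empty model outputs
            synthetic.append((src, tgt))
    return synthetic


def build_training_set(genuine: Iterable[Pair], synthetic: Iterable[Pair]) -> List[Pair]:
    """Concatenate genuine and synthetic pairs; mixing ratios are left to the user."""
    return list(genuine) + list(synthetic)


def toy_reverse_model(sentence: str) -> str:
    """Stand-in for a real target-to-source NMT model (assumption, for illustration only)."""
    return " ".join(reversed(sentence.split()))


genuine_pairs = [("a b", "x y")]
monolingual_target = ["hello world", "good morning"]

synthetic_pairs = back_translate(monolingual_target, toy_reverse_model)
training_set = build_training_set(genuine_pairs, synthetic_pairs)
```

The target side of every synthetic pair is human-written text, which is why this kind of augmentation is expected to help fluency on the output side even when the generated source side is noisy.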


DL-CRISPR: A Deep Learning Method for Off-Target Activity Prediction in CRISPR/Cas9 With Data Augmentation

IEEE Access ◽

10.1109/access.2020.2989454 ◽

2020 ◽

Vol 8 ◽

pp. 76610-76617

Author(s):

Yu Zhang ◽

Yahui Long ◽

Rui Yin ◽

Chee Keong Kwoh

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Learning Method ◽

Activity Prediction ◽

Target Activity


Evaluation of Scalability and Degree of Fine-Tuning of Deep Convolutional Neural Networks for COVID-19 Screening on Chest X-ray Images Using Explainable Deep-Learning Algorithm

Journal of Personalized Medicine ◽

10.3390/jpm10040213 ◽

2020 ◽

Vol 10 (4) ◽

pp. 213 ◽

Cited By ~ 1

Author(s):

Ki-Sun Lee ◽

Jae Young Kim ◽

Eun-tae Jeon ◽

Won Suk Choi ◽

Nan Hee Kim ◽

...

Keyword(s):

Deep Learning ◽

Learning Strategies ◽

Transfer Learning ◽

Learning Algorithm ◽

Fine Tuning ◽

Deep Convolutional Neural Networks ◽

X Ray ◽

Backbone Network ◽

Deep Learning Algorithm ◽

Chest X Ray

According to recent studies, patients with COVID-19 show different features on chest X-ray (CXR) images than patients with other lung diseases. This study aimed to evaluate the effect of layer depth and degree of fine-tuning in transfer learning for deep convolutional neural network (CNN)-based COVID-19 screening on CXR images, in order to identify efficient transfer learning strategies. The CXR images used in this study were collected from publicly available repositories and classified into three classes: COVID-19, pneumonia, and normal. To evaluate the effect of layer depth within the same CNN architecture, VGG-16 and VGG-19 were used as backbone networks. Each backbone network was then trained with different degrees of fine-tuning and comparatively evaluated. The experimental results showed that the highest AUC for COVID-19 classification, 0.950, was obtained in the experimental group in which only 2 of the 5 blocks of the VGG-16 backbone network were fine-tuned. In conclusion, when classifying medical images with a limited amount of data, a deeper network may not guarantee better results. In addition, even with the same pre-trained CNN architecture, an appropriate degree of fine-tuning can help to build an efficient deep learning model.
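The notion of a "degree of fine-tuning" can be made concrete with a small framework-free sketch. It relies on the standard VGG-16 layout of five convolutional blocks with 2, 2, 3, 3, and 3 convolutional layers, and it assumes that the fine-tuned blocks are the ones closest to the output while earlier blocks stay frozen. The abstract does not state which blocks were unfrozen, so that choice and the names `fine_tune_plan` and `trainable_conv_layers` are assumptions.

```python
from typing import Dict

# Convolutional layers per block in the standard VGG-16 architecture.
VGG16_CONV_BLOCKS: Dict[str, int] = {
    "block1": 2,
    "block2": 2,
    "block3": 3,
    "block4": 3,
    "block5": 3,
}


def fine_tune_plan(n_trainable_blocks: int) -> Dict[str, bool]:
    """Mark the last n blocks as trainable and freeze the rest (assumed policy)."""
    names = list(VGG16_CONV_BLOCKS)
    if not 0 <= n_trainable_blocks <= len(names):
        raise ValueError("n_trainable_blocks must be between 0 and 5")
    first_trainable = len(names) - n_trainable_blocks
    return {name: index >= first_trainable for index, name in enumerate(names)}


def trainable_conv_layers(plan: Dict[str, bool]) -> int:
    """Count convolutional layers that would receive gradient updates under a plan."""
    return sum(VGG16_CONV_BLOCKS[name] for name, trainable in plan.items() if trainable)


# A "2 of 5 blocks" setting under the assumption above.
plan = fine_tune_plan(2)
n_layers = trainable_conv_layers(plan)
```

In a deep learning framework, the resulting flags would be applied by toggling each layer's trainable or gradient setting before compiling or building the optimizer.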
