Application of Random Region Augmentation Algorithm in Deep Learning

2021 ◽  
Vol 2078 (1) ◽  
pp. 012001
Author(s):  
Tao Chen ◽  
Hongying Lu ◽  
Sihe Xiao

Abstract In computer vision, collecting and curating image data is the core driving force. However, current data collection cannot cover every real-world deployment scenario. Data augmentation algorithms aim to increase the diversity of the data set and improve the robustness of the model. Traditional data augmentation methods fall into geometric augmentation and color augmentation, and include flipping, rotating, cropping, translation, stretching, zooming, adding noise, blurring, Dropout, Cutout, and color jittering. These methods have limitations, and their effect is not obvious. Building on the idea of the Cutout algorithm, this paper proposes the RRA (Random Region Augmentation) algorithm, which divides the image into four quadrant regions and randomly selects an ROI in each. Unlike Cutout, which discards the selected region outright, RRA applies random color enhancement to it and then applies geometric augmentation to the whole image. Compared with the original single data augmentation operation, the algorithm improves precision by 7% and recall by 7%.
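The steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `rra_augment`, the parameters `roi_frac` and `max_shift`, the per-channel additive color shift, and the choice of a horizontal flip as the geometric step are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def rra_augment(image, rng, roi_frac=0.5, max_shift=40):
    """Hypothetical random-region augmentation of an H x W x 3 uint8 image."""
    out = image.copy()
    h, w = out.shape[:2]
    half_h, half_w = h // 2, w // 2
    # Top-left corners of the four quadrants.
    for top, left in [(0, 0), (0, half_w), (half_h, 0), (half_h, half_w)]:
        q_h = half_h if top == 0 else h - half_h
        q_w = half_w if left == 0 else w - half_w
        # Pick a random ROI inside this quadrant (size is an assumed fraction).
        r_h = max(1, int(q_h * roi_frac))
        r_w = max(1, int(q_w * roi_frac))
        y = top + rng.integers(0, q_h - r_h + 1)
        x = left + rng.integers(0, q_w - r_w + 1)
        # Cutout would zero this ROI; here its color is randomly shifted instead.
        shift = rng.integers(-max_shift, max_shift + 1, size=3)
        roi = out[y:y + r_h, x:x + r_w].astype(np.int16) + shift
        out[y:y + r_h, x:x + r_w] = np.clip(roi, 0, 255).astype(np.uint8)
    # Geometric step on the whole image: an assumed random horizontal flip.
    if rng.random() < 0.5:
        out = out[:, ::-1].copy()
    return out

# Usage on a synthetic gray image.
rng = np.random.default_rng(0)
augmented = rra_augment(np.full((64, 48, 3), 128, dtype=np.uint8), rng)
```

The input image is copied rather than modified in place, so the same source image can be augmented repeatedly with different random draws.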

2016 ◽  
Vol 22 (1) ◽  
pp. 102-107 ◽  
Author(s):  
Omer Ishaq ◽  
Sajith Kecheril Sadanandan ◽  
Carolina Wählby

Zebrafish (Danio rerio) is an important vertebrate model organism in biomedical research, especially suitable for morphological screening due to its transparent body during early development. Deep learning has emerged as a dominant paradigm for data analysis and found a number of applications in computer vision and image analysis. Here we demonstrate the potential of a deep learning approach for accurate high-throughput classification of whole-body zebrafish deformations in multifish microwell plates. Deep learning uses the raw image data as an input, without the need for expert knowledge in feature design or optimization of the segmentation parameters. We trained the deep learning classifier on as few as 84 images (before data augmentation) and achieved a classification accuracy of 92.8% on an unseen test data set, which is comparable to the previous state of the art (95%) based on user-specified segmentation and deformation metrics. Ablation studies that digitally removed whole fish or parts of the fish from the images revealed that the classifier learned discriminative features from the image foreground, and we observed that the deformations of the head region, rather than the visually apparent bent tail, were more important for good classification performance.


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Different whiteboard image degradations highly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, different researchers addressed the problem through different image enhancement techniques. Most of the state-of-the-art approaches applied common image processing techniques such as background foreground segmentation, text extraction, contrast and color enhancements and white balancing. However, such types of conventional enhancement methods are incapable of recovering severely degraded pen-stroke contents and produce artifacts in the presence of complex pen-stroke illustrations. In order to surmount such problems, the authors have proposed a deep learning based solution. They have contributed a new whiteboard image data set and adopted two deep convolutional neural network architectures for whiteboard image quality enhancement applications. Their different evaluations of the trained models demonstrated their superior performances over the conventional methods.


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Combining data augmentation with the deep learning approach improves Character Recognition (CR) to 80.02%, against a baseline of 75.08%.


Author(s):  
Shaoqiang Wang ◽  
Shudong Wang ◽  
Song Zhang ◽  
Yifan Wang

Abstract The aim of this work is to detect dynamic EEG signals automatically and so reduce the time cost of epilepsy diagnosis. In recognizing epileptic electroencephalogram (EEG) signals, traditional machine learning and statistical methods require manual feature engineering in order to show excellent results on a single data set. Manually selected features may carry a bias, and cannot guarantee validity and extensibility on real-world data. In practical applications, deep learning methods can release people from feature engineering to a certain extent. As long as the focus is on improving data quality and quantity, the model can learn automatically and improve further. In addition, deep learning can extract many features that are difficult for humans to perceive, thereby making the algorithm more robust. Based on the design of the ResNeXt deep neural network, this paper designs a Time-ResNeXt network structure suited to time-series EEG epilepsy detection. The accuracy of Time-ResNeXt in detecting EEG epilepsy can reach 91.50%. The Time-ResNeXt network structure achieves strong performance on the benchmark dataset (Bern-Barcelona dataset) and has great potential for improving clinical practice.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose Laser-based measurement techniques offer various advantages over conventional measurement techniques: they are non-destructive, non-contact and fast, and they support long measuring distances. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors to accomplish automatic measurement. To solve this problem, this paper aims to propose a novel method.
Design/methodology/approach We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors, realizing automatic recognition in different backgrounds. Experiments show that the recognition rate of Mask RCNN is better than that of two other deep learning algorithms, especially for small-scale targets. Based on this, an ellipse detection algorithm is introduced to obtain the ellipses of retroreflectors from recognized target areas. The center coordinates of retroreflectors in the camera coordinate system are then obtained mathematically.
Findings To verify the accuracy of this method, an experiment was carried out: the distance between two retroreflectors with a known separation of 1,000.109 mm was measured with a root-mean-square error of 2.596 mm, meeting the requirements for coarse location of retroreflectors.
Research limitations/implications (i) As the data set only has 200 pictures, although we have used some data augmentation methods such as rotating, mirroring and cropping, there is still room for improvement in the generalization ability of detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light.
Originality/value The method obtains the center coordinates of multiple retroreflectors automatically even in a cluttered background; recognizes retroreflectors of different sizes, especially small targets; meets the recognition requirement of multiple targets in a large field of view; and obtains 3D centers of targets by monocular model-based vision.
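The root-mean-square error quoted in the findings can be computed as in the short sketch below. Only the reference distance of 1,000.109 mm comes from the abstract; the repeated readings in `measured_mm` are invented for illustration, and the helper name `rmse` is hypothetical.

```python
import math

def rmse(measured, reference):
    """Root-mean-square error of repeated measurements against a known value."""
    squared_errors = [(m - reference) ** 2 for m in measured]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

reference_mm = 1000.109                        # known distance from the abstract
measured_mm = [1002.4, 997.8, 1001.9, 998.1]   # hypothetical readings, not from the paper
print(f"RMSE: {rmse(measured_mm, reference_mm):.3f} mm")
```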


Geofluids ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Dongsheng Wang ◽  
Jun Feng ◽  
Xinpeng Zhao ◽  
Yeping Bai ◽  
Yujie Wang ◽  
...  

It is difficult to form a method for recognizing the degree of infiltration of a tunnel lining. To solve this problem, we propose a recognition method by using a deep convolutional neural network. We carry out laboratory tests, prepare cement mortar specimens with different saturation levels, simulate different degrees of infiltration of tunnel concrete linings, and establish an infrared thermal image data set with different degrees of infiltration. Then, based on a deep learning method, the data set is trained using the Faster R-CNN+ResNet101 network, and a recognition model is established. The experiments show that the recognition model established by the deep learning method can be used to select cement mortar specimens with different degrees of infiltration by using an accurately minimized rectangular outer frame. This model shows that the classification recognition model for tunnel concrete lining infiltration established by the indoor experimental method has high recognition accuracy.


2019 ◽  
Vol 109 (6) ◽  
pp. 1083-1087 ◽  
Author(s):  
Dor Oppenheim ◽  
Guy Shani ◽  
Orly Erlich ◽  
Leah Tsror

Many plant diseases have distinct visual symptoms, which can be used to identify and classify them correctly. This article presents a potato disease classification algorithm that leverages these distinct appearances and advances in computer vision made possible by deep learning. The algorithm uses a deep convolutional neural network, training it to classify the tubers into five classes: namely, four disease classes and a healthy potato class. The database of images used in this study, containing potato tubers of different cultivars, sizes, and diseases, was acquired, classified, and labeled manually by experts. The models were trained over different train-test splits to better understand the amount of image data needed to apply deep learning for such classification tasks. The models were tested over a data set of images taken using standard low-cost RGB (red, green, and blue) sensors and were tagged by experts, demonstrating high classification accuracy. This is the first article to report the successful implementation of deep convolutional networks, popular in object identification, to the task of disease identification in potato tubers, showing the potential of deep learning techniques in agricultural tasks.


Diagnostics ◽  
2020 ◽  
Vol 10 (5) ◽  
pp. 261
Author(s):  
Tae-Young Heo ◽  
Kyoung Min Kim ◽  
Hyun Kyu Min ◽  
Sun Mi Gu ◽  
Jae Hyun Kim ◽  
...  

The use of deep-learning-based artificial intelligence (AI) is emerging in ophthalmology, with AI-mediated differential diagnosis of neovascular age-related macular degeneration (AMD) and dry AMD a promising methodology for precise treatment strategies and prognosis. Here, we developed deep learning algorithms and predicted diseases using 399 fundus images. Based on feature extraction and classification with fully connected layers, we applied the Visual Geometry Group with 16 layers (VGG16) model of convolutional neural networks to classify new images. Image-data augmentation in our model was performed using Keras ImageDataGenerator, and the leave-one-out procedure was used for model cross-validation. The prediction and validation results obtained using the AI AMD diagnosis model showed relevant performance and suitability as well as better diagnostic accuracy than manual review by first-year residents. These results suggest the efficacy of this tool for early differential diagnosis of AMD in situations involving shortages of ophthalmology specialists and other medical devices.
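The leave-one-out procedure mentioned in this abstract holds out one sample per fold and trains on all the others. The sketch below shows the idea in plain Python. It is not the authors' pipeline: the helper names, the toy one-dimensional data and the nearest-neighbour stand-in for the classifier are all assumptions made for illustration.

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, test_index) pairs, holding out one sample per fold."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out

def loo_accuracy(samples, labels, fit, predict):
    """Fraction of held-out samples that a freshly fitted model predicts correctly."""
    correct = 0
    for train_idx, test_idx in leave_one_out_splits(len(samples)):
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        if predict(model, samples[test_idx]) == labels[test_idx]:
            correct += 1
    return correct / len(samples)

# Toy stand-in classifier: 1-nearest-neighbour on scalar features.
def fit_1nn(xs, ys):
    return list(zip(xs, ys))

def predict_1nn(model, x):
    return min(model, key=lambda pair: abs(pair[0] - x))[1]

toy_x = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
toy_y = [0, 0, 0, 1, 1, 1]
print(loo_accuracy(toy_x, toy_y, fit_1nn, predict_1nn))
```

With n samples this fits the model n times, which is affordable for a few hundred images but becomes costly as the data set grows.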

