scholarly journals Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset

2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article presents comparisons of results from detecting the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented papier is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. The decision of which network will generate the best result for such a limited training set is not a trivial task. Conducted research suggests that the deep neural networks will achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model gained a worse AP result; however, it can be noted that the relationship between the number of input samples and the obtained results has a significantly lower influence than in the case of other CNN models, which, in the authors’ assessment, is a desired feature in the case of a limited training set.

2016 ◽  
Vol 2016 ◽  
pp. 1-15 ◽  
Author(s):  
Benjamin Chandler ◽  
Ennio Mingolla

Heavily occluded objects are more difficult for classification algorithms to identify correctly than unoccluded objects. This effect is rare and thus hard to measure with datasets like ImageNet and PASCAL VOC, however, owing to biases in human-generated image pose selection. We introduce a dataset that emphasizes occlusion and additions to a standard convolutional neural network aimed at increasing invariance to occlusion. An unmodified convolutional neural network trained and tested on the new dataset rapidly degrades to chance-level accuracy as occlusion increases. Training with occluded data slows this decline but still yields poor performance with high occlusion. Integrating novel preprocessing stages to segment the input and inpaint occlusions is an effective mitigation. A convolutional network so modified is nearly as effective with more than 81% of pixels occluded as it is with no occlusion. Such a network is also more accurate on unoccluded images than an otherwise identical network that has been trained with only unoccluded images. These results depend on successful segmentation. The occlusions in our dataset are deliberately easy to segment from the figure and background. Achieving similar results on a more challenging dataset would require finding a method to split figure, background, and occluding pixels in the input.


2017 ◽  
Author(s):  
◽  
Shuxian Shen

Convolutional Neural Networks (CNN) are a popular neural network structure for image based applications. This thesis discusses an alternative network, the morphological shared-weight neural network (MSNN) for object detection. In this thesis, three combined network structures are developed for multi-scale object detection. The dataset used for the experiments presented here were created by the author for this thesis study. The convolutional neural network is used as the baseline for judging the performance of the MSNN. Experiments suggest that when training data is limited, the MSNN has a more robust and precise performance as compared with the CNN.


2018 ◽  
Author(s):  
◽  
Zhi Zhang

Despite being a core topic for more than several decades, object detection is still receiving increasing attentions due to its irreplaceable importance in a wide variety of applications. Abundant object detectors based on deep neural networks have shown significantly revamped accuracies in recent years. However, it's still the day one for these models to be effectively deployed to real world. In this dissertation, we focus on object detection models which tackle real world problems that are unavailable few years ago. We also aim at making object detectors on the go, which means detectors are not longer required to be run on workstations and cloud services which is latency unfriendly. To achieve these goals, we addressed the problem in two phases: application and deployment. We have done thoughtful research on both areas. Our contribution involves inter-frame information fusing, model knowledge distillation, advanced model flow control for progressive inference, and hardware oriented model design and optimization. More specifically, we proposed a novel cross-frame verification scheme for spatial temporal fused object detection model for sequential images and videos in a proposal and reject favor. To compress model from a learning basis and resolve domain specific training data shortage, we improved the learning algorithm to handle insufficient labeled data by searching for optimal guidance paths from pre-trained models. To further reduce model inference cost, we designed a progressive neural network which run in flexible cost enabled by RNN style decision controller during runtime. We recognize the awkward model deployment problem, especially for object detection models that require excessive customized layers. In response, we propose to use end-to-end neural network which use pure neural network components to substitute traditional post-processing operations. We also applied operator decomposition and graph level and on-device optimization towards real-time object detection on low power edge devices. All these works have achieved state-of-the-art performances and converted to successful applications.


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a very demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detector and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric property in Faster R-CNN to improve the detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we then use 2D geometric constraints to refine the RPN-RoIs, in which the 2D constraint of every classification is a convex hull region enclosing the width and height coordinates of the ground-truth boxes on the training set. Comparison experiments are implemented on two indoor datasets SUN2012 and NYUv2. Since the depth information is available in NYUv2, we involve depth constraints in GP-Faster and propose 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster increase the performance of the mean average precision.


Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and the recognition of emotions in speech (RER) is the most demanded part of them. In this paper, we propose a computer model of emotion recognition based on an ensemble of bidirectional recurrent neural network with LSTM memory cell and deep convolutional neural network ResNet18. In this paper, computer studies of the RAVDESS database containing emotional speech of a person are carried out. RAVDESS-a data set containing 7356 files. Entries contain the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, existing audio recordings must be pre-processed in such a way as to extract the main characteristic features of certain emotions. This was done using Mel-frequency cepstral coefficients, chroma coefficients, as well as the characteristics of the frequency spectrum of audio recordings. In this paper, computer studies of various models of neural networks for emotion recognition are carried out on the example of the data described above. In addition, machine learning algorithms were used for comparative analysis. Thus, the following models were trained during the experiments: logistic regression (LR), classifier based on the support vector machine (SVM), decision tree (DT), random forest (RF), gradient boosting over trees – XGBoost, convolutional neural network CNN, recurrent neural network RNN (ResNet18), as well as an ensemble of convolutional and recurrent networks Stacked CNN-RNN. The results show that neural networks showed much higher accuracy in recognizing and classifying emotions than the machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed higher accuracy.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pei Yang ◽  
Yong Pi ◽  
Tao He ◽  
Jiangming Sun ◽  
Jianan Wei ◽  
...  

Abstract Background 99mTc-pertechnetate thyroid scintigraphy is a valid complementary avenue for evaluating thyroid disease in the clinic, the image feature of thyroid scintigram is relatively simple but the interpretation still has a moderate consistency among physicians. Thus, we aimed to develop an artificial intelligence (AI) system to automatically classify the four patterns of thyroid scintigram. Methods We collected 3087 thyroid scintigrams from center 1 to construct the training dataset (n = 2468) and internal validating dataset (n = 619), and another 302 cases from center 2 as external validating datasets. Four pre-trained neural networks that included ResNet50, DenseNet169, InceptionV3, and InceptionResNetV2 were implemented to construct AI models. The models were trained separately with transfer learning. We evaluated each model’s performance with metrics as following: accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), recall, precision, and F1-score. Results The overall accuracy of four pre-trained neural networks in classifying four common uptake patterns of thyroid scintigrams all exceeded 90%, and the InceptionV3 stands out from others. It reached the highest performance with an overall accuracy of 92.73% for internal validation and 87.75% for external validation, respectively. As for each category of thyroid scintigrams, the area under the receiver operator characteristic curve (AUC) was 0.986 for ‘diffusely increased,’ 0.997 for ‘diffusely decreased,’ 0.998 for ‘focal increased,’ and 0.945 for ‘heterogeneous uptake’ in internal validation, respectively. Accordingly, the corresponding performances also obtained an ideal result of 0.939, 1.000, 0.974, and 0.915 in external validation, respectively. Conclusions Deep convolutional neural network-based AI model represented considerable performance in the classification of thyroid scintigrams, which may help physicians improve the interpretation of thyroid scintigrams more consistently and efficiently.


Author(s):  
Shweta Dabetwar ◽  
Stephen Ekwaro-Osire ◽  
João Paulo Dias

Abstract Composite materials have enormous applications in various fields. Thus, it is important to have an efficient damage detection method to avoid catastrophic failures. Due to the existence of multiple damage modes and the availability of data in different formats, it is important to employ efficient techniques to consider all the types of damage. Deep neural networks were seen to exhibit the ability to address similar complex problems. The research question in this work is ‘Can data fusion improve damage classification using the convolutional neural network?’ The specific aims developed were to 1) assess the performance of image encoding algorithms, 2) classify the damage using data from separate experimental coupons, and 3) classify the damage using mixed data from multiple experimental coupons. Two different experimental measurements were taken from NASA Ames Prognostic Repository for Carbon Fiber Reinforced polymer. To use data fusion, the piezoelectric signals were converted into images using Gramian Angular Field (GAF) and Markov Transition Field. Using data fusion techniques, the input dataset was created for a convolutional neural network with three hidden layers to determine the damage states. The accuracies of all the image encoding algorithms were compared. The analysis showed that data fusion provided better results as it contained more information on the damages modes that occur in composite materials. Additionally, GAF was shown to perform the best. Thus, the combination of data fusion and deep neural network techniques provides an efficient method for damage detection of composite materials.


Author(s):  
Amira Ahmad Al-Sharkawy ◽  
Gehan A. Bahgat ◽  
Elsayed E. Hemayed ◽  
Samia Abdel-Razik Mashali

Object classification problem is essential in many applications nowadays. Human can easily classify objects in unconstrained environments easily. Classical classification techniques were far away from human performance. Thus, researchers try to mimic the human visual system till they reached the deep neural networks. This chapter gives a review and analysis in the field of the deep convolutional neural network usage in object classification under constrained and unconstrained environment. The chapter gives a brief review on the classical techniques of object classification and the development of bio-inspired computational models from neuroscience till the creation of deep neural networks. A review is given on the constrained environment issues: the hardware computing resources and memory, the object appearance and background, and the training and processing time. Datasets that are used to test the performance are analyzed according to the images environmental conditions, besides the dataset biasing is discussed.


2022 ◽  
pp. 1559-1575
Author(s):  
Mário Pereira Véstias

Machine learning is the study of algorithms and models for computing systems to do tasks based on pattern identification and inference. When it is difficult or infeasible to develop an algorithm to do a particular task, machine learning algorithms can provide an output based on previous training data. A well-known machine learning model is deep learning. The most recent deep learning models are based on artificial neural networks (ANN). There exist several types of artificial neural networks including the feedforward neural network, the Kohonen self-organizing neural network, the recurrent neural network, the convolutional neural network, the modular neural network, among others. This article focuses on convolutional neural networks with a description of the model, the training and inference processes and its applicability. It will also give an overview of the most used CNN models and what to expect from the next generation of CNN models.


2013 ◽  
Vol 13 (3) ◽  
pp. 535-544 ◽  
Author(s):  
A. Alqudah ◽  
V. Chandrasekar ◽  
M. Le

Abstract. Rainfall observed on the ground is dependent on the four dimensional structure of precipitation aloft. Scanning radars can observe the four dimensional structure of precipitation. Neural network is a nonparametric method to represent the nonlinear relationship between radar measurements and rainfall rate. The relationship is derived directly from a dataset consisting of radar measurements and rain gauge measurements. The performance of neural network based rainfall estimation is subject to many factors, such as the representativeness and sufficiency of the training dataset, the generalization capability of the network to new data, seasonal changes, and regional changes. Improving the performance of the neural network for real time applications is of great interest. The goal of this paper is to investigate the performance of rainfall estimation based on Radial Basis Function (RBF) neural networks using radar reflectivity as input and rain gauge as the target. Data from Melbourne, Florida NEXRAD (Next Generation Weather Radar) ground radar (KMLB) over different years along with rain gauge measurements are used to conduct various investigations related to this problem. A direct gauge comparison study is done to demonstrate the improvement brought in by the neural networks and to show the feasibility of this system. The principal components analysis (PCA) technique is also used to reduce the dimensionality of the training dataset. Reducing the dimensionality of the input training data will reduce the training time as well as reduce the network complexity which will also avoid over fitting.


Sign in / Sign up

Export Citation Format

Share Document