A Lightweight YOLOv4-Based Forestry Pest Detection Method Using Coordinate Attention and Feature Fusion

Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1587
Author(s):  
Mingfeng Zha ◽  
Wenbin Qian ◽  
Wenlong Yi ◽  
Jing Hua

Traditional pest detection methods are challenging to use in complex forestry environments due to their low accuracy and speed. To address this issue, this paper proposes the YOLOv4_MF model. The YOLOv4_MF model utilizes MobileNetv2 as the feature extraction block and replaces traditional convolution with depthwise separable convolution to reduce the number of model parameters. In addition, the coordinate attention mechanism was embedded in MobileNetv2 to enhance feature information. A symmetric structure consisting of three-layer spatial pyramid pooling is presented, and an improved feature fusion structure was designed to fuse the target information. For the loss function, focal loss was used instead of cross-entropy loss to enhance the network’s learning of small targets. The experimental results showed that the YOLOv4_MF model has 4.24% higher mAP, 4.37% higher precision, and 6.68% higher recall than the YOLOv4 model. The size of the proposed model was reduced to 1/6 of that of YOLOv4. Moreover, compared with some state-of-the-art algorithms, the proposed algorithm achieved 38.62% mAP on the COCO dataset.
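To make the parameter saving from depthwise separable convolution concrete, the following is a minimal sketch that counts weights for a single layer. The layer sizes are hypothetical and are not taken from the paper; bias terms and batch normalization are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 512, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.2f}")
```

The ratio between the two counts is 1 / (1/c_out + 1/k^2), so the saving approaches a factor of k^2 as the number of output channels grows.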

2021 ◽  
Vol 233 ◽  
pp. 02012
Author(s):  
Shousheng Liu ◽  
Zhigang Gai ◽  
Xu Chai ◽  
Fengxiang Guo ◽  
Mei Zhang ◽  
...  

Detecting and counting bacterial colonies is tedious and time-consuming work. Fortunately, convolutional neural network (CNN) detection methods are effective for target detection. Bacterial colonies are small targets, which have long been a difficult problem in target detection. This paper proposes a small-target enhancement detection method based on double CNNs, which not only improves the detection accuracy but also maintains a detection speed similar to that of a general detection model. The first CNN uses the SSD_MOBILENET_V1 network, which performs both target positioning and target recognition. Candidate targets are screened with a low confidence threshold, which helps ensure that small targets are not missed. The second CNN takes the candidate target regions from the first round of detection, crops the image sub-blocks one by one, and uses the MOBILENET_V1 network to filter the candidates with a higher confidence threshold, which helps ensure good detection of small targets. With the two-round enhancement detection method ported to the embedded NVIDIA Jetson AGX Xavier platform, the detection accuracy of small targets is significantly improved, and the false detection rate and missed detection rate are reduced to less than 1%.
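The two-round scheme described above can be sketched as a small cascade. This is only an illustration under stated assumptions: `detector` and `classifier` are hypothetical callables standing in for the SSD_MOBILENET_V1 and MOBILENET_V1 networks, the thresholds are made-up values, and the image is a toy 2D list.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max) in pixels
Image = Sequence[Sequence[float]]  # toy grayscale image: rows of pixel values


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the sub-block covered by box out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def two_round_detect(
    image: Image,
    detector: Callable[[Image], List[Tuple[Box, float]]],
    classifier: Callable[[Image], float],
    low_thresh: float = 0.1,   # hypothetical permissive threshold (round 1)
    high_thresh: float = 0.7,  # hypothetical strict threshold (round 2)
) -> List[Box]:
    # Round 1: keep every candidate above the permissive threshold,
    # so that weakly scored small targets are not discarded early.
    candidates = [box for box, score in detector(image) if score >= low_thresh]
    # Round 2: re-score each cropped candidate and keep only confident ones.
    return [b for b in candidates if classifier(crop(image, b)) >= high_thresh]


# Toy stand-ins for the two networks (assumptions, for illustration only).
toy_image = [[1.0, 1.0, 0.0, 0.0],
             [1.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0]]


def toy_detector(img: Image) -> List[Tuple[Box, float]]:
    return [((0, 0, 2, 2), 0.3), ((2, 2, 4, 4), 0.2), ((0, 2, 2, 4), 0.05)]


def toy_classifier(patch: Image) -> float:
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)  # mean brightness as a fake confidence


kept = two_round_detect(toy_image, toy_detector, toy_classifier)
print(kept)
```

In the toy run, the first round passes two of the three candidates, and the second round keeps only the one whose cropped patch scores above the strict threshold.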


2014 ◽  
Vol 28 (28) ◽  
pp. 1450199
Author(s):  
Shengze Hu ◽  
Zhenwen Wang

In the real world, a large number of systems can be described by networks in which nodes represent entities and edges represent the interconnections between them. Community structure is one of the interesting properties revealed in the study of networks. Many methods have been developed to extract communities from networks using generative models, which give the probability of generating a network based on some assumption about the communities. However, many generative models require the number of communities in the network to be set in advance. Methods based on such models lack practicality, because the number of communities is unknown before the communities are determined. In this paper, the Bayesian nonparametric method is used to develop a new community detection method. First, a generative model is built to give the probability of generating the network and its communities. Next, the model parameters and the number of communities are calculated by fitting the model to the actual network. Finally, the communities in the network are determined using the model parameters. In the experiments, we apply the proposed method to synthetic and real-world networks and compare it with some other community detection methods. The experimental results show that the proposed method is efficient at detecting communities in networks.


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2444
Author(s):  
Mazhar Javed Awan ◽  
Osama Ahmed Masood ◽  
Mazin Abed Mohammed ◽  
Awais Yasin ◽  
Azlan Mohd Zain ◽  
...  

In recent years, the amount of malware spreading through the internet and infecting computers and other communication devices has increased tremendously. To date, countless techniques and methodologies have been proposed to detect and neutralize these malicious agents. However, as new and automated malware generation techniques emerge, a lot of malware continues to be produced, which can bypass some state-of-the-art malware detection methods. Therefore, there is a need for the classification and detection of these adversarial agents that can compromise the security of people, organizations, and countless other forms of digital assets. In this paper, we propose a deep learning framework based on spatial attention and a convolutional neural network (SACNN) for image-based classification of 25 well-known malware families with and without class balancing. Performance was evaluated on the Malimg benchmark dataset using precision, recall, specificity, precision, and F1 score, on which our proposed model with class balancing reached 97.42%, 97.95%, 97.33%, 97.11%, and 97.32%. We also conducted experiments on SACNN with class balancing on a benign class, which also produced results above 97%. The results indicate that our proposed model can be used for image-based malware detection with high performance, despite being simpler than other available solutions.


2018 ◽  
Vol 2018 ◽  
pp. 1-10
Author(s):  
Zhongmin Liu ◽  
Zhicai Chen ◽  
Zhanming Li ◽  
Wenjin Hu

In recent years, techniques based on deep detection models have achieved overwhelming improvements in detection accuracy, which makes them the best suited to applications such as pedestrian detection. However, speed and accuracy are conflicting goals, and the tension between them has long puzzled researchers. How to achieve a good trade-off between them is a problem we must consider when designing detectors. To this end, we apply the general detector YOLOv2, a state-of-the-art method in general detection tasks, to pedestrian detection. We then modify the network parameters and structures according to the characteristics of pedestrians, making the method more suitable for detecting them. Experimental results on the INRIA pedestrian detection dataset show that it has a fairly high detection speed with a small precision gap compared with state-of-the-art pedestrian detection methods. Furthermore, we add weak semantic segmentation networks after the shared convolution layers to highlight pedestrians and employ a scale-aware structure in our model to handle the wide size range in the Caltech pedestrian detection dataset, which yields further gains over the initial modification.


Author(s):  
Duowei Tang ◽  
Peter Kuppens ◽  
Luc Geurts ◽  
Toon van Waterschoot

Amongst the various characteristics of a speech signal, the expression of emotion is one of the characteristics that exhibits the slowest temporal dynamics. Hence, a performant speech emotion recognition (SER) system requires a predictive model that is capable of learning sufficiently long temporal dependencies in the analysed speech signal. Therefore, in this work, we propose a novel end-to-end neural network architecture based on the concept of dilated causal convolution with context stacking. Firstly, the proposed model consists only of parallelisable layers and is hence suitable for parallel processing, while avoiding the inherent lack of parallelisability occurring with recurrent neural network (RNN) layers. Secondly, the design of a dedicated dilated causal convolution block allows the model to have a receptive field as large as the input sequence length, while maintaining a reasonably low computational cost. Thirdly, by introducing a context stacking structure, the proposed model is capable of exploiting long-term temporal dependencies, hence providing an alternative to the use of RNN layers. We evaluate the proposed model in SER regression and classification tasks and provide a comparison with a state-of-the-art end-to-end SER model. Experimental results indicate that the proposed model requires only 1/3 of the number of model parameters used in the state-of-the-art model, while also significantly improving SER performance. Further experiments are reported to understand the impact of using various types of input representations (i.e. raw audio samples vs log mel-spectrograms) and to illustrate the benefits of an end-to-end approach over the use of hand-crafted audio features. Moreover, we show that the proposed model can efficiently learn intermediate embeddings preserving speech emotion information.
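As a rough illustration of why stacked dilated causal convolutions reach a large receptive field at low cost, the sketch below computes the receptive field of a stack and applies a single dilated causal filter to a 1D signal. The kernel size and dilation schedule are assumptions chosen for illustration, not the configuration of the paper, and context stacking is not modeled.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples that can influence one output sample of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_causal_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """y[t] = sum_j weights[j] * x[t - j * dilation], with x taken as zero for t < 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:          # causal: only current and past samples are read
                acc += w * x[idx]
        out.append(acc)
    return out


# Assumed schedule: kernel size 2 with dilations doubling at every layer.
dilations = [2 ** i for i in range(8)]   # 1, 2, 4, ..., 128
print(receptive_field(2, dilations))

# An impulse at t = 3 only affects outputs at t >= 3 (never earlier samples).
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(dilated_causal_conv1d(impulse, [1.0, 1.0], dilation=2))
```

With kernel size 2 and doubling dilations, the receptive field doubles with each added layer, while the number of weights grows only linearly in the number of layers.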


2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts oriented bounding boxes (OBB) to estimate objects' locations, scales and orientations based on YOLO (You Only Look Once), which is one of the top detection algorithms, performing well in both accuracy and speed. Horizontal bounding boxes (HBB), which are not robust to orientation variances, are used in existing object detection methods to detect targets. The proposed orientation invariant YOLO (OIYOLO) detector can effectively deal with bird’s eye viewpoint images in which the orientation angles of the objects are arbitrary. In order to estimate the rotation angle of objects, we design a new angle loss function. The training of OIYOLO therefore forces the network to learn the annotated orientation angle of objects, making OIYOLO orientation invariant. The proposed approach of predicting OBB can be applied in other detection frameworks. In addition, to evaluate the proposed OIYOLO detector, we create the UAV-DAHUA dataset, which is accurately annotated with objects' locations, scales and orientation angles. Extensive experiments conducted on the UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency compared with the baseline YOLO algorithms.
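The difference between an oriented and a horizontal bounding box can be illustrated with a small geometric sketch. The (center, width, height, angle) parameterization below is an assumption made for illustration; the abstract does not state how OIYOLO encodes its boxes or how its angle loss is defined.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def obb_corners(cx: float, cy: float, w: float, h: float, angle_rad: float) -> List[Point]:
    """Corners of a w x h box centered at (cx, cy), rotated by angle_rad about its center."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset, then translate it to the box center.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]


def enclosing_hbb(corners: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# A 4 x 2 object rotated by 45 degrees (illustrative values).
corners = obb_corners(0.0, 0.0, 4.0, 2.0, math.pi / 4)
x0, y0, x1, y1 = enclosing_hbb(corners)
hbb_area = (x1 - x0) * (y1 - y0)
print(f"OBB area: {4.0 * 2.0}, enclosing HBB area: {hbb_area:.2f}")
```

For a rotated elongated object, the enclosing horizontal box covers considerably more area than the object itself, which is one reason horizontal boxes fit arbitrary orientations poorly.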


Author(s):  
Ning-Min Shen ◽  
Jing Li ◽  
Pei-Yun Zhou ◽  
Ying Huo ◽  
Yi Zhuang

Co-saliency detection, an emerging research area in saliency detection, aims to extract the common saliency from multiple images. The extracted co-saliency map has been utilized in various applications, such as co-segmentation and co-recognition. With the rapid development of image acquisition technology, original digital images are becoming increasingly clear. Existing co-saliency detection methods need enormous computer memory and have high computational complexity when processing these images. These limitations make it hard to satisfy the demand for real-time user interaction. This paper proposes a fast co-saliency detection method based on image block partition and sparse feature extraction (BSFCoS). Firstly, the images are divided into several uniform blocks, and low-level features are extracted from the Lab and RGB color spaces. In order to maintain the characteristics of the original images while reducing the number of feature points as far as possible, the Truncated Power method for sparse principal components is employed to extract sparse features. Furthermore, the K-Means method is adopted to cluster the extracted sparse features and calculate the three salient feature weights. Finally, the co-saliency map is acquired by fusing the features of the saliency maps for the single image and the multiple images. The proposed method has been tested and simulated on two benchmark datasets: the Co-saliency Pairs and CMU Cornell iCoseg datasets. Compared with existing co-saliency methods, BSFCoS has a significant running time improvement in multi-image processing while preserving the quality of the detection results. Lastly, a co-segmentation method based on BSFCoS is also given and has better co-segmentation performance.


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4184
Author(s):  
Zhiwei Cao ◽  
Huihua Yang ◽  
Juan Zhao ◽  
Shuhong Guo ◽  
Lingqiao Li

Multispectral pedestrian detection, which consists of a color stream and a thermal stream, is essential under conditions of insufficient illumination because the fusion of the two streams can provide complementary information for detecting pedestrians based on deep convolutional neural networks (CNNs). In this paper, we introduced and adapted a simple and efficient one-stage YOLOv4 to replace the current state-of-the-art two-stage fast-RCNN for multispectral pedestrian detection and to directly predict bounding boxes with confidence scores. To further improve the detection performance, we analyzed the existing multispectral fusion methods and proposed a novel multispectral channel feature fusion (MCFF) module for integrating the features from the color and thermal streams according to the illumination conditions. Moreover, several fusion architectures, such as Early Fusion, Halfway Fusion, Late Fusion, and Direct Fusion, were carefully designed based on the MCFF to transfer the feature information from the bottom to the top at different stages. Finally, the experimental results on the KAIST and Utokyo pedestrian benchmarks showed that Halfway Fusion obtained the best performance of all architectures and that the MCFF could adapt fused features in the two modalities. The log-average miss rates (MR) for the two modalities with reasonable settings were 4.91% and 23.14%, respectively.
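The log-average miss rate quoted above is usually computed in the Caltech benchmark style: the miss rate is sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0, and the samples are averaged in the log domain. The sketch below follows that common definition; it is assumed, not confirmed, that the paper uses exactly this protocol, and the handling of curves that start above a reference FPPI is one convention among several.

```python
import math
from typing import Sequence


def log_average_miss_rate(
    fppi: Sequence[float], miss_rate: Sequence[float], num_points: int = 9
) -> float:
    """Geometric mean of miss rates sampled at log-spaced FPPI reference points.

    fppi must be sorted in ascending order, with miss_rate[i] measured at fppi[i].
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    samples = []
    for ref in refs:
        mr = 1.0  # assumed convention: no operating point at or below ref
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                mr = m  # last operating point whose FPPI does not exceed ref
            else:
                break
        samples.append(max(mr, 1e-10))  # guard against log(0)
    return math.exp(sum(math.log(s) for s in samples) / len(samples))


# Made-up curve, for illustration only.
example_fppi = [0.005, 0.02, 0.1, 0.5, 1.0]
example_mr = [0.40, 0.30, 0.20, 0.12, 0.10]
print(f"{log_average_miss_rate(example_fppi, example_mr):.4f}")
```

Averaging in the log domain means a detector is rewarded for lowering the miss rate across the whole FPPI range rather than at a single operating point.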


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8369
Author(s):  
Yizhi Luo ◽  
Zhixiong Zeng ◽  
Huazhong Lu ◽  
Enli Lv

In this paper, a lightweight channel-wise attention model is proposed for the real-time detection of five representative pig postures: standing, lying on the belly, lying on the side, sitting, and mounting. An optimized compressed block with a symmetrical structure is proposed based on model structure and parameter statistics, and efficient channel attention modules are used as a channel-wise mechanism to improve the model architecture. The results show that the algorithm’s average precision in detecting standing, lying on the belly, lying on the side, sitting, and mounting is 97.7%, 95.2%, 95.7%, 87.5%, and 84.1%, respectively, and the speed of inference is around 63 ms (CPU = i7, RAM = 8G) per posture image. Compared with state-of-the-art models (ResNet50, Darknet53, CSPDarknet53, MobileNetV3-Large, and MobileNetV3-Small), the proposed model has fewer model parameters and lower computational complexity. The statistical results of the postures (with continuous 24 h monitoring) show that some pigs eat in the early morning, and that the peak of the pigs’ feeding appears after the input of new feed, which reflects the health of the pig herd for farmers.


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1091
Author(s):  
Bilal Al-Ahmad ◽  
Ala’ M. Al-Zoubi ◽  
Ruba Abu Khurma ◽  
Ibrahim Aljarah

As the COVID-19 pandemic rapidly spreads across the world, regrettably, misinformation and fake news related to COVID-19 have also spread remarkably. Such misinformation has confused people. To detect such COVID-19 misinformation, an effective detection method should be applied to obtain more accurate information. This will help people and researchers easily differentiate between true and fake news. The objective of this research was to introduce an enhanced evolutionary detection approach that obtains better results than previous approaches. The proposed approach aimed to reduce the number of symmetrical features and obtain a high accuracy after implementing three wrapper feature selections for evolutionary classifications using particle swarm optimization (PSO), the genetic algorithm (GA), and the salp swarm algorithm (SSA). The experiments were conducted on a popular dataset, the Koirala dataset. Based on the obtained prediction results, the proposed model showed promising and superior predictive performance with a high accuracy (75.4%) and reduced the number of features to 303. In addition, in a comparison with other state-of-the-art classifiers, our results showed that the proposed detection method with the genetic algorithm model outperformed the other classifiers in accuracy.
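In wrapper feature selection, a search algorithm such as PSO, GA, or SSA proposes binary feature masks, and each mask is scored by training and evaluating a classifier on the selected features. The sketch below shows one fitness function of the kind often used for this purpose, trading classification error against subset size. The weighting `alpha` and the `error_of` callback are assumptions for illustration; the abstract does not give the paper's actual fitness function.

```python
from typing import Callable, List, Sequence


def wrapper_fitness(
    mask: Sequence[int],
    error_of: Callable[[List[int]], float],
    alpha: float = 0.99,  # hypothetical weight on classification error
) -> float:
    """Lower is better: alpha * error + (1 - alpha) * fraction of features kept."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty subset cannot be classified; treat as worst case
    ratio = len(selected) / len(mask)
    return alpha * error_of(selected) + (1 - alpha) * ratio


def toy_error(selected: List[int]) -> float:
    """Stand-in for a cross-validated classifier error (made-up behavior)."""
    return 0.2 if 0 in selected else 0.5  # only feature 0 is informative here


small = wrapper_fitness([1, 0, 0, 0], toy_error)
full = wrapper_fitness([1, 1, 1, 1], toy_error)
print(small, full)
```

In the toy case both masks give the same error, so the mask with fewer features receives the lower (better) fitness, which is the pressure that drives the search toward compact feature subsets.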

