LPNet: Retina Inspired Neural Network for Object Detection and Recognition

Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2883
Author(s):  
Jie Cao ◽  
Chun Bao ◽  
Qun Hao ◽  
Yang Cheng ◽  
Chenglin Chen

The detection of rotated objects is a meaningful and challenging research task. Although state-of-the-art deep learning models, especially convolutional neural networks (CNNs), exhibit feature invariance, their architectures were not specifically designed for rotation invariance; they only slightly compensate for it through pooling layers. In this study, we propose a novel network, named LPNet, to solve the problem of object rotation. LPNet improves detection accuracy by incorporating a retina-like log-polar transformation. Furthermore, LPNet is a plug-and-play architecture for object detection and recognition. It consists of two parts, which we name the encoder and the decoder. The encoder extracts image features in log-polar coordinates, while the decoder eliminates image noise in Cartesian coordinates. Moreover, according to the movement of the center point, LPNet has stable and sliding modes. LPNet takes the single-shot multibox detector (SSD) network as the baseline network and the Visual Geometry Group network (VGG16) as the feature extraction backbone. The experimental results show that, compared with the conventional SSD network, the mean average precision (mAP) of LPNet increased by 3.4% for regular objects and by 17.6% for rotated objects.
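The key property of the retina-like log-polar transformation is that a rotation of the input image becomes a simple row shift in the transformed image. A minimal pure-Python sketch (nearest-neighbour sampling; the output resolution and sampling scheme are illustrative assumptions, not the paper's implementation):

```python
import math

def log_polar(img, out_h=32, out_w=32):
    """Map a grayscale image (list of rows) to log-polar coordinates.

    Output rows index the angle theta, columns index the log-radius,
    sampled by nearest neighbour about the image centre. A rotation of
    the input becomes a vertical (row-wise) shift of the output, which
    is the invariance LPNet's encoder exploits.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = math.hypot(cy, cx)
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):                     # angle bins
        theta = 2 * math.pi * i / out_h
        for j in range(out_w):                 # log-radius bins
            r = math.exp(j / out_w * math.log(max_r + 1)) - 1
            y = int(round(cy + r * math.sin(theta)))
            x = int(round(cx + r * math.cos(theta)))
            if 0 <= y < h and 0 <= x < w:
                out[i][j] = img[y][x]
    return out
```

Because the radius at column 0 is zero, every output row starts from the same centre pixel, and rotating the source image only permutes the rows.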

Author(s):  
Aofeng Li ◽  
Xufang Zhu ◽  
Shuo He ◽  
Jiawei Xia

Abstract: In view of the deficiencies of traditional visual water surface object detection, such as the existence of non-detection zones and failure to acquire global information, and of the single-shot multibox detector (SSD) object detection algorithm, such as poor remote detection and low detection precision for small objects, this study proposes a water surface object detection algorithm for panoramic vision based on an improved SSD. We reconstruct the backbone network of the SSD algorithm, replacing VGG16 with a ResNet-50 network and adding five feature extraction layers. Richer semantic information for the shallow feature maps is obtained through a feature pyramid network structure with deconvolution. An experiment is conducted on a purpose-built water surface object dataset. Results show that the mean Average Precision (mAP) of the improved algorithm increases by 4.03% compared with the existing SSD detection algorithm. The improved algorithm can effectively raise the overall detection precision of water surface objects and enhance the detection of remote objects.
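The top-down fusion a feature pyramid performs can be sketched in miniature: upsample a deep, low-resolution map to a shallow map's size and add it in, so the shallow map gains higher-level semantics. This toy version uses nearest-neighbour upsampling on plain lists; it illustrates the idea only, not the paper's deconvolution-based module:

```python
def fpn_merge(shallow, deep):
    """FPN-style top-down merge of two single-channel feature maps.

    The deeper (smaller) map is upsampled by nearest neighbour to the
    shallow map's resolution and added element-wise, enriching the
    shallow map with deep semantic information.
    """
    h, w = len(shallow), len(shallow[0])
    dh, dw = len(deep), len(deep[0])
    return [[shallow[y][x] + deep[y * dh // h][x * dw // w]
             for x in range(w)] for y in range(h)]
```

A real pyramid repeats this merge level by level and typically applies a learned upsampling (deconvolution) and a smoothing convolution after each addition.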


2019 ◽  
Vol 11 (7) ◽  
pp. 786 ◽  
Author(s):  
Yang-Lang Chang ◽  
Amare Anagaw ◽  
Lena Chang ◽  
Yi Wang ◽  
Chih-Yu Hsiao ◽  
...  

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application to oil and ship detection has been the focus of many previous research studies. Many object detection methods, ranging from traditional to deep learning approaches, have been proposed. However, the majority of them are computationally intensive and have accuracy problems. The huge volume of remote sensing data also poses a challenge for real-time object detection. To mitigate this problem, a high-performance computing (HPC) method has been proposed to accelerate SAR imagery analysis, utilizing GPU-based computing. In this paper, we propose an enhanced GPU-based deep learning method to detect ships in SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is used to model the architecture and train the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms the Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot MultiBox Detector (SSD) methods. Additionally, to reduce computational time while retaining competitive detection accuracy, we develop a new architecture with fewer layers, called YOLOv2-reduced. In the experiments, we use two datasets for training and testing: the SAR Ship Detection Dataset (SSDD) and a Diversified SAR Ship Detection Dataset (DSSDD). YOLOv2 test results showed an increase in ship detection accuracy as well as a noticeable reduction in computational time compared to Faster R-CNN. The proposed YOLOv2 architecture achieves accuracies of 90.05% and 89.13% on the SSDD and DSSDD datasets, respectively. The proposed YOLOv2-reduced architecture has similarly competent detection performance to YOLOv2, but with less computational time on an NVIDIA TITAN X GPU.
The experimental results show that deep learning can make a big leap forward in improving the performance of SAR image ship detection.


2020 ◽  
Vol 12 (3) ◽  
pp. 458 ◽  
Author(s):  
Ugur Alganci ◽  
Mehmet Soydas ◽  
Elif Sertel

Object detection from satellite images has been a challenging problem for many years. With the development of effective deep learning algorithms and advances in hardware, higher accuracies have been achieved in the detection of various objects from very high-resolution (VHR) satellite images. This article provides a comparative evaluation of state-of-the-art convolutional neural network (CNN)-based object detection models, namely Faster R-CNN, the Single Shot MultiBox Detector (SSD), and You Only Look Once v3 (YOLO-v3), to cope with the limited number of labeled data and to automatically detect airplanes in VHR satellite images. Data augmentation with rotation, rescaling, and cropping was applied to artificially increase the amount of training data from satellite images. Moreover, a non-maximum suppression (NMS) algorithm was introduced at the end of the SSD and YOLO-v3 flows to remove the multiple detections that occur near each detected object in overlapping areas. The trained networks were applied to five independent VHR test images covering airports and their surroundings to evaluate their performance objectively. Accuracy assessment of the test regions showed that the Faster R-CNN architecture provided the highest accuracy according to F1 scores, average precision (AP) metrics, and visual inspection of the results. YOLO-v3 ranked second, with slightly lower performance but a balanced trade-off between accuracy and speed. SSD provided the lowest detection performance but was better at object localization. The results were also evaluated in terms of object size versus detection accuracy, which showed that large- and medium-sized airplanes were detected with higher accuracy.
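Greedy NMS, as appended here to the SSD and YOLO-v3 flows, keeps the highest-scoring box and discards any remaining box that overlaps it too much. A minimal reference version (corner-form boxes; the 0.5 threshold is a common default, not necessarily the one used in the paper):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression.

    Repeatedly keep the highest-scoring remaining box, then drop every
    other box whose IoU with it exceeds iou_thresh. Returns the kept
    indices in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

In practice NMS is run per class, so a ship detection does not suppress an overlapping airplane detection.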


2021 ◽  
Vol 11 (3) ◽  
pp. 1096
Author(s):  
Qing Li ◽  
Yingcheng Lin ◽  
Wei He

The high requirements for computing and memory are the biggest challenges in deploying existing object detection networks to embedded devices. Existing lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor flexibility of the network structure and is not suitable for some specific scenarios. In this paper, we propose a lightweight object detection network, Single-Shot MultiBox Detector (SSD)7 with Feature Fusion and Attention Mechanism (SSD7-FFAM), which saves storage space and reduces the amount of computation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. Firstly, the FFAM method fuses feature maps rich in high-level semantic information with low-level feature maps to improve the detection accuracy for small objects. Then, a lightweight attention mechanism that cascades channel and spatial attention modules is employed to enhance the target's contextual information and guide the network to focus on its easy-to-recognize features. SSD7-FFAM achieves 83.7% mean Average Precision (mAP) with 1.66 MB of parameters and an average running time of 0.033 s on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is well suited for deployment on embedded devices for real-time object detection.
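The channel half of a cascaded channel/spatial attention module can be reduced to a toy: gate each channel by a sigmoid of its global average, so informative channels are amplified and weak ones suppressed. This is a sketch of the general idea only, not the authors' exact FFAM module (which uses learned weights rather than the raw average):

```python
import math

def channel_attention(fmap):
    """Toy channel attention over a feature map given as a list of
    2-D channels: each channel is rescaled by a sigmoid gate computed
    from its global average activation."""
    weighted = []
    for ch in fmap:
        avg = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-avg))   # sigmoid in (0, 1)
        weighted.append([[v * gate for v in row] for row in ch])
    return weighted
```

A cascaded design would follow this with a spatial gate computed across channels at each location, applied to the channel-weighted map.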


2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Cong Lin ◽  
Yongbin Zheng ◽  
Xiuchun Xiao ◽  
Jialun Lin

The workload of radiologists has dramatically increased in the context of the COVID-19 pandemic, leading to misdiagnoses and missed diagnoses. Artificial intelligence technology can assist doctors in locating and identifying lesions in medical images. To improve the accuracy of disease diagnosis in medical imaging, we propose a lung disease detection neural network that is superior to the current mainstream object detection models. By combining the advantages of the RepVGG block and the Resblock in information fusion and information extraction, we design a backbone, RRNet, with few parameters and strong feature extraction capabilities. We then propose a structure called Information Reuse, which addresses the low utilization of the original network's output features by connecting the normalized features back into the network. Combining the RRNet backbone with an improved RefineDet, we obtain the overall network, called CXR-RefineDet. Extensive experiments on VinDr-CXR, the largest public chest radiograph detection dataset, show that CXR-RefineDet reaches a detection accuracy of 0.1686 mAP and an inference speed of 6.8 fps, outperforming two-stage object detection algorithms built on strong backbones such as ResNet-50 and ResNet-101. In addition, the fast inference speed of CXR-RefineDet makes a practical computer-aided diagnosis system feasible.


2021 ◽  
Author(s):  
Lu Tan ◽  
Tianran Huangfu ◽  
Liyao Wu ◽  
Wenying Chen

Abstract Background: The correct identification of pills is very important to ensure the safe administration of drugs to patients. We used three currently mainstream object detection models, namely Faster R-CNN, the Single Shot MultiBox Detector (SSD), and You Only Look Once v3 (YOLO v3), to identify pills and compared their performance.
Methods: In this paper, we introduce the basic principles of the three object detection models. We trained each algorithm on a pill image dataset and analyzed the performance of the three models to determine the best pill recognition model. Finally, these models were used to detect difficult samples and the results were compared.
Results: The mean average precision (mAP) of Faster R-CNN reached 87.69%, but YOLO v3 had a significant advantage in detection speed, with a frame rate (FPS) more than eight times that of Faster R-CNN. This means that YOLO v3 can operate in real time with a high mAP of 80.17%. The YOLO v3 algorithm also performed better in the comparison of difficult-sample detection results. In contrast, SSD did not achieve the highest score in terms of either mAP or FPS.
Conclusion: Our study shows that YOLO v3 has an advantage in detection speed while maintaining a reasonable mAP, and thus can be applied for real-time pill identification in a hospital pharmacy environment.


2021 ◽  
Vol 143 (7) ◽  
Author(s):  
Yanbiao Zou ◽  
Mingquan Zhu ◽  
Xiangzhi Chen

Abstract: Accurately locating the weld seam under strong noise is the biggest challenge in automated welding. In this paper, we construct a robust seam detector within the framework of a deep learning object detection algorithm. A representative object detection algorithm, the single shot multibox detector (SSD), is studied to establish the seam detector framework, and the improved SSD is applied to seam detection. Under the SSD object detection framework, and taking into account the characteristics of the seam detection task, the multifeature combination network (MFCN) is proposed. The network comprehensively utilizes the local and global information carried by multilayer features to detect a weld seam, realizing rapid and accurate detection. To overcome the failure of single-frame seam image detection under continuous super-strong noise, the sequence image multifeature combination network (SMFCN) is proposed based on the MFCN detector. A recurrent neural network (RNN) is used to learn the temporal context of the convolutional features, so the seam can be detected accurately under continuous super-strong noise. Experimental results show that the proposed seam detectors are extremely robust: the SMFCN maintains very high detection accuracy under continuous super-strong noise. Welding results show that a laser vision seam tracking system using the SMFCN keeps the welding precision within industrial requirements at a welding current of 150 A.
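The role of the RNN in the SMFCN, stabilising per-frame features with temporal context, can be caricatured with a one-line recurrence. The weights below are made up, and the recurrence is element-wise for simplicity; a real RNN learns weight matrices over whole feature vectors:

```python
import math

def rnn_smooth(feature_seq, w_in=0.6, w_rec=0.4):
    """Minimal recurrent pass over a sequence of per-frame feature
    vectors: each hidden state mixes the current frame with the
    previous state, so a frame corrupted by noise is pulled back
    toward its temporal context."""
    h = [0.0] * len(feature_seq[0])
    out = []
    for frame in feature_seq:
        h = [math.tanh(w_in * x + w_rec * hp) for x, hp in zip(frame, h)]
        out.append(h)
    return out
```

With a steady input, the hidden state grows toward a fixed point, which is why a brief noise burst in one frame perturbs the output less than it perturbs the raw features.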


2021 ◽  
Vol 13 (12) ◽  
pp. 307
Author(s):  
Vijayakumar Varadarajan ◽  
Dweepna Garg ◽  
Ketan Kotecha

Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at various levels of representation, which aids in understanding data that includes text, voice, and visuals. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, semantic segmentation, and more. Object detection in videos involves confirming the presence of an object in the image or video and then locating it accurately for recognition. Video modelling techniques suffer from high computation and memory costs, which can reduce accuracy and efficiency when identifying objects in real time. Current object detection techniques based on deep convolutional neural networks execute multilevel convolution and pooling operations over the entire image to extract deep semantic properties. Such models can deliver superior results for large objects, but they fail on objects of varying size that have low resolution and are strongly affected by noise, because the features produced by the repeated convolution operations do not fully represent the essential characteristics of the objects in real time. With the help of multi-scale anchor boxes, the approach proposed in this paper enhances detection accuracy by extracting features at multiple convolution levels of the object. The major contribution of this paper is a model designed to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds while improving accuracy.
The proposed model achieves 84.49 mAP on the test set of the Pascal VOC-2007 dataset at 11 FPS, which is better than other real-time object detection models.
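The multi-scale anchor idea can be sketched as follows: each cell of each feature map spawns boxes at a scale tied to that map and at several aspect ratios, so coarse maps cover large objects and fine maps cover small ones. The sizes, scales, and ratios below are assumptions for illustration, not the paper's configuration:

```python
def make_anchors(fmap_sizes, img_size=300, scales=(0.2, 0.5, 0.8),
                 ratios=(1.0, 2.0, 0.5)):
    """Generate centre-form anchor boxes (cx, cy, w, h).

    Each feature map is paired with one scale; every cell of that map
    emits one anchor per aspect ratio, centred on the cell. Finer maps
    (larger fmap_sizes) get smaller scales, covering small objects.
    """
    anchors = []
    for fsize, scale in zip(fmap_sizes, scales):
        step = img_size / fsize                 # pixels per cell
        for gy in range(fsize):
            for gx in range(fsize):
                cx, cy = (gx + 0.5) * step, (gy + 0.5) * step
                for r in ratios:
                    w = img_size * scale * (r ** 0.5)
                    h = img_size * scale / (r ** 0.5)
                    anchors.append((cx, cy, w, h))
    return anchors
```

During training, each ground-truth box is matched to the anchors it overlaps best, and the network regresses offsets from those anchors rather than absolute coordinates.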


2021 ◽  
Author(s):  
Jinde Zhua

Abstract: The detection of marine organisms is an important part of the intelligent strategy in marine ranching, which requires an underwater robot to detect marine organisms quickly and accurately in a complex ocean environment. Building on recent deep learning algorithms, this paper proposes locating marine organisms in images or video to construct a real-time object detection system for marine organisms. The YOLOv4 neural network was employed to extract deep features of marine organisms, enabling accurate detection and size measurement of different fish, which can be used for fishery evaluation. Furthermore, we improve the architecture of the backbone and the neck connections, yielding a variant called YOLOv4-embedding. Compared with other one-stage object detection algorithms such as EfficientDet-D3, the YOLOv4-embedding algorithm achieves better detection accuracy, with higher detection confidence and a higher detection ratio. The results demonstrate that the proposed method can rapidly detect different varieties of marine organisms. Compared to YOLOv4, the mAP 75 of YOLOv4-embedding improves by 2.92% on the marine organism dataset, at a rapid rate of ~51 FPS on an RTX 3090, and it reaches 60.8% AP 50 on the MS COCO dataset.


2019 ◽  
Vol 9 (14) ◽  
pp. 2785 ◽  
Author(s):  
Yun Jiang ◽  
Tingting Peng ◽  
Ning Tan

The Single Shot MultiBox Detector (SSD) has achieved good results in object detection, but it suffers from an insufficient grasp of context information and a loss of features in deep layers. To alleviate these problems, we propose a single-shot object detection network, Context Perception-SSD (CP-SSD). CP-SSD improves the network's understanding of context through scene perception modules, capturing context information for objects of different scales. A semantic activation module is applied to the deep feature maps: through self-supervised learning, it adjusts the contextual feature information and channel interdependence, enhancing useful semantic information. CP-SSD was validated on the benchmark dataset PASCAL VOC 2007. The experimental results show that the mean Average Precision (mAP) of CP-SSD reaches 77.8%, 0.6% higher than that of SSD, and the detection effect is significantly improved on images where the object is difficult to distinguish from the background.
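The mAP figures quoted throughout these abstracts rest on per-class average precision. A minimal 11-point interpolated AP in the PASCAL VOC 2007 style, simplified by assuming every ground-truth object appears somewhere in the ranked detections:

```python
def average_precision(scores, labels):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    `scores` are detection confidences; `labels` mark each detection as
    a true (1) or false (0) positive. Detections are ranked by score,
    precision/recall are traced along the ranking, and precision is
    interpolated at 11 evenly spaced recall thresholds.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    total_pos = sum(labels)          # simplification: all positives detected
    prec, rec = [], []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        prec.append(tp / (tp + fp))
        rec.append(tp / total_pos)
    ap = 0.0
    for t in (k / 10 for k in range(11)):
        # interpolated precision: best precision at recall >= t
        ap += max((p for p, r in zip(prec, rec) if r >= t), default=0.0) / 11
    return ap
```

mAP is then the mean of this quantity over all object classes, computed at a fixed IoU matching threshold (0.5 for VOC; mAP 75 and AP 50 in the abstracts above refer to IoU thresholds of 0.75 and 0.5).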

