Real-Time Face Occlusion Recognition Algorithm Based on Feature Fusion

Author(s):  
Xiangde Zhang ◽  
Bin Zheng ◽  
Yuanjie Li ◽  
Lianping Yang
2019 ◽  
Vol 55 (13) ◽  
pp. 742-745

Author(s):  
Kang Yang ◽  
Huihui Song ◽  
Kaihua Zhang ◽  
Jiaqing Fan

IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 43202-43213

Author(s):  
Zongwang Lyu ◽  
Huifang Jin ◽  
Tong Zhen ◽  
Fuyan Sun ◽  
Hui Xu

2021 ◽  
pp. 1-18

Author(s):  
R.S. Rampriya ◽  
Sabarinathan ◽  
R. Suganya

In the near future, the combination of UAVs (Unmanned Aerial Vehicles) and computer vision will play a vital role in periodically monitoring railroad condition to ensure passenger safety. The most significant module in railroad visual processing is obstacle detection, where the concern is an obstacle fallen near the track, inside or outside the gauge. This makes it important to detect and segment the railroad into three key regions: gauge inside, rails, and background. Traditional railroad segmentation methods depend either on manual feature selection or on expensive dedicated devices such as LiDAR, which is typically less reliable for railroad semantic segmentation. Moreover, although cameras mounted on moving vehicles such as drones produce high-resolution images, segmenting precise pixel information from these aerial images is challenging because of the cluttered railroad surroundings. RSNet is a multi-level feature fusion algorithm for segmenting railroad aerial images captured by UAV. It comprises an attention-based convolutional encoder for feature extraction, which is robust and computationally efficient, and a modified residual decoder for segmentation, which considers only essential features and incurs less overhead while achieving higher performance, even on real-time railroad drone imagery. The network is trained and tested on a railroad scenic view segmentation dataset (RSSD), which we built from real UAV images, achieving a dice coefficient of 0.973 and a Jaccard index of 0.94 on test data, outperforming existing approaches such as the residual unit and residual squeeze net.
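The dice coefficient and Jaccard index reported above are standard overlap metrics for segmentation masks. The following is a minimal NumPy sketch, not the authors' code; the function names and the binary-mask formulation are illustrative assumptions.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A n B| / (|A| + |B|) for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def jaccard_index(pred, target, eps=1e-7):
    """Jaccard (IoU) = |A n B| / |A u B| for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# For the three-region task (gauge inside, rails, background), compute both
# metrics per class and average them to obtain mean scores over the test set.
```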


2021 ◽  
Vol 13 (9) ◽  
pp. 1619

Author(s):  
Bin Yan ◽  
Pan Fan ◽  
Xiaoyan Lei ◽  
Zhijie Liu ◽  
Fuzeng Yang

The apple target recognition algorithm is one of the core technologies of the apple-picking robot. However, most existing apple detection algorithms cannot distinguish apples occluded by tree branches from apples occluded by other apples. If such an algorithm is applied directly to the picking robot, the apples, the grasping end-effector, and the mechanical picking arm are very likely to be damaged. Motivated by this practical problem, and in order to automatically recognize the graspable and ungraspable apples in an apple tree image, a lightweight apple target detection method based on improved YOLOv5s was proposed for the picking robot. First, the BottleneckCSP module was redesigned as a BottleneckCSP-2 module, which replaced the BottleneckCSP module in the backbone of the original YOLOv5s network. Second, an SE module, a visual attention mechanism, was inserted into the proposed improved backbone (a generic sketch of this block follows below). Third, the fusion mode of the feature maps fed into the medium-size target detection layer of the original YOLOv5s network was improved. Finally, the initial anchor box sizes of the original network were refined. The experimental results indicated that the proposed improved network model could effectively identify graspable apples, which are unoccluded or occluded only by tree leaves, and ungraspable apples, which are occluded by tree branches or by other fruits. Specifically, the recognition recall, precision, mAP and F1 score were 91.48%, 83.83%, 86.75% and 87.49%, respectively, and the average recognition time was 0.015 s per image. Compared with the original YOLOv5s, YOLOv3, YOLOv4 and EfficientDet-D0 models, the mAP of the proposed improved YOLOv5s model increased by 5.05%, 14.95%, 4.74% and 6.75%, respectively, and the model size was compressed by 9.29%, 94.6%, 94.8% and 15.3%, respectively. The average recognition speed per image of the proposed model was 2.53, 1.13 and 3.53 times that of EfficientDet-D0, YOLOv4 and YOLOv3, respectively. The proposed method can provide technical support for real-time, accurate detection of multiple fruit targets for the apple-picking robot.
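The SE (squeeze-and-excitation) module named in the second step is a standard channel attention block. Below is a minimal PyTorch sketch of a generic SE block; the reduction ratio and the exact insertion points within the improved YOLOv5s backbone are not specified by the abstract, so those details are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pooling ("squeeze") followed
    by a two-layer bottleneck that produces per-channel weights ("excitation")."""
    def __init__(self, channels, reduction=16):  # reduction=16 is the common default, assumed here
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                              # channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)                    # squeeze to B x C
        w = self.fc(w).view(b, c, 1, 1)                # excitation weights
        return x * w                                   # reweight feature channels

# e.g. after a backbone stage: features = SEBlock(256)(features)
```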


2021 ◽  
Vol 25 (4) ◽  
pp. 809-823

Author(s):  
Qing Ye ◽  
Haoxin Zhong ◽  
Chang Qu ◽  
Yongmei Zhang

Human activity recognition is a key technology in intelligent video surveillance and an important research direction in the field of computer vision. However, human interaction features are complex, and motion characteristics differ across the time periods of an action. In this paper, a human interaction recognition algorithm based on a parallel multi-feature fusion network is proposed. First, since different time periods of an action carry different amounts of information, an improved time-phased video downsampling method based on a Gaussian model is proposed. Second, the Inception module uses convolution kernels of different scales for feature extraction, which improves network performance while reducing the number of network parameters, and the ResNet module mitigates the degradation problem caused by increasing network depth and achieves higher classification accuracy. We therefore combine the advantages of the Inception and ResNet modules to extract feature information in parallel, merge the extracted features, and continue training on the fused representation, realizing a parallel multi-feature neural network (a sketch of this parallel structure follows below). Experiments are carried out on the UT dataset. Compared with traditional activity recognition algorithms, the proposed method better accomplishes the recognition of six kinds of interactive actions, reaching an accuracy of 88.9%.
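To make the parallel fusion idea concrete, here is a minimal PyTorch sketch of two feature branches, one Inception-style and one residual, whose pooled outputs are concatenated before a shared classifier. The layer sizes and the six-class head are illustrative assumptions; the paper's actual branch architectures and temporal sampling are not reproduced.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Minimal Inception-style block: 1x1, 3x3 and 5x5 convs in parallel,
    concatenated along the channel axis (multi-scale feature extraction)."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, 5, padding=2)

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

class ResidualBlock(nn.Module):
    """Basic residual block: the identity shortcut eases training of deep nets."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

class ParallelFusionNet(nn.Module):
    """Two parallel extractors whose pooled features are concatenated
    and classified jointly (six interaction classes, as on the UT dataset)."""
    def __init__(self, in_ch=3, num_classes=6):
        super().__init__()
        self.inception = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1),
                                       InceptionBlock(32, 32))   # -> 96 channels
        self.residual = nn.Sequential(nn.Conv2d(in_ch, 96, 3, padding=1),
                                      ResidualBlock(96))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(96 + 96, num_classes)

    def forward(self, x):
        a = self.pool(self.inception(x)).flatten(1)   # B x 96
        b = self.pool(self.residual(x)).flatten(1)    # B x 96
        return self.head(torch.cat([a, b], dim=1))    # fused features -> logits

# e.g. logits = ParallelFusionNet()(torch.randn(2, 3, 112, 112))
```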


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3521

Author(s):  
Funa Zhou ◽  
Po Hu ◽  
Shuai Yang ◽  
Chenglin Wen

Rotating machinery often suffers from a type of fault whose feature is significant in the frequency domain but insignificant in the time domain. For this type of fault, a deep learning-based diagnosis method developed in the frequency domain can reach high accuracy but not real-time performance, whereas one developed in the time domain achieves real-time diagnosis at lower accuracy. In this paper, a multimodal feature fusion-based deep learning method for accurate, real-time online diagnosis of rotating machinery is proposed. The method can directly extract the latent frequency features of abnormal behavior from time-domain data. First, multimodal features corresponding to the original data, the slope data, and the curvature data are extracted by three separate deep neural networks. Then, a multimodal feature fusion is developed to obtain a fused feature that characterizes the latent frequency information in the time-domain data. Finally, the fused feature is used as the input to a Softmax classifier to produce a real-time online diagnosis for frequency-type fault data. A simulation experiment and a bearing fault diagnosis case study confirm the high efficiency of the proposed method.
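As a minimal NumPy sketch of the three input modalities described above: the original signal plus its slope (first difference) and curvature (second difference). The function name and the synthetic test signal are illustrative assumptions; the paper routes each modality through its own deep network before fusion, which is omitted here.

```python
import numpy as np

def multimodal_views(x):
    """Build the three modalities from a raw time-domain signal: the original
    samples, the slope (1st difference) and the curvature (2nd difference).
    Differencing emphasises high-frequency content, helping expose
    frequency-domain fault signatures without an explicit FFT."""
    slope = np.diff(x, n=1, prepend=x[:1])             # same length as x
    curvature = np.diff(x, n=2, prepend=x[:1].repeat(2))
    return np.stack([x, slope, curvature])             # shape (3, len(x))

# Example: a 1 kHz sine sampled at 12.8 kHz as a stand-in vibration signal.
t = np.arange(2048) / 12800.0
views = multimodal_views(np.sin(2 * np.pi * 1000 * t))
print(views.shape)  # (3, 2048) -- one row per modality
```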

