Video Segmentation: Latest Research Papers

Deep learning, a branch of machine learning, has been applied in many fields, including image recognition, image segmentation, and video segmentation. In recent years it has also been gradually applied to food recognition. Food recognition, however, involves complex and highly variable scenes, and both the accuracy and the speed of recognition remain unsatisfactory. This paper addresses these problems by proposing a food image recognition method based on neural networks. Combining Tiny-YOLO with a twin network, the method proposes a two-stage learning mode, YOLO-SIMM, and designs two versions, YOLO-SiamV1 and YOLO-SiamV2. In experiments, the method achieves only moderate recognition accuracy; however, it requires no manual labeling and has good prospects for practical application. In addition, a method for detecting and recognizing foreign bodies in food is proposed, which separates foreign bodies from food using threshold segmentation. Experimental results show that this method can effectively distinguish desiccant from foreign matter and achieves the desired effect.
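The threshold segmentation step mentioned above can be illustrated with a minimal sketch. It assumes a grayscale image in which the foreign body is brighter than the surrounding food; the function names, the toy image, and the threshold value of 128 are all hypothetical and are not taken from the paper.

```python
def threshold_segment(gray, threshold):
    """Return a binary mask: 1 where a pixel is brighter than the threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]


def mask_area(mask):
    """Count the pixels marked as foreground."""
    return sum(sum(row) for row in mask)


# Toy 4x4 grayscale image: dark "food" with a bright 2x2 "foreign body".
# The intensities and the threshold are illustrative assumptions.
image = [
    [30, 35, 32, 31],
    [33, 210, 220, 34],
    [31, 215, 205, 30],
    [32, 33, 31, 35],
]

mask = threshold_segment(image, threshold=128)
for row in mask:
    print(row)
print("foreground pixels:", mask_area(mask))
```

In practice the threshold would have to be chosen per imaging setup; a fixed value is used here only to keep the sketch short.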


Temporal Video Segmentation Using Optical Flow Estimation

Iraqi Journal of Science ◽ 10.24996/ijs.2021.62.11.36 ◽ 2021 ◽ pp. 4181-4194

Author(s): Eman Hato

Keyword(s): Optical Flow ◽ Video Segmentation ◽ Computational Cost ◽ Superior Performance ◽ Detection Accuracy ◽ Motion Feature ◽ Detection Process ◽ Shot Detection ◽ Temporal Video Segmentation ◽ Shot Boundary

Shot boundary detection is the process of segmenting a video into basic units, known as shots, by finding the transition frames between them. Much research has been conducted on detecting shot boundaries accurately. However, speeding up the detection process while maintaining high accuracy still needs improvement. This paper introduces a new method for finding abrupt shot boundaries in video with high accuracy and lower computational cost. The proposed method consists of two stages. First, projection features are used to separate non-boundary transitions from candidate transitions that may contain an abrupt boundary. Only the candidate transitions are retained for the next stage, so detection is accelerated by reducing the detection scope. In the second stage, the candidate segments are refined using a motion feature derived from optical flow to remove non-boundary frames. The results show that the proposed method achieves excellent detection accuracy (an F-score of 0.98) and effectively speeds up the detection process. In addition, a comparative analysis confirms the superior performance of the proposed method over other methods.
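The first stage described above (filtering transitions with projection features) can be sketched as follows. This is only an illustration under stated assumptions: the projection feature is taken to be the row and column intensity sums of a grayscale frame, the distance is a sum of absolute differences, and the threshold is a hypothetical value. The paper's exact feature definition is not given in the abstract, and the optical-flow refinement of the second stage is omitted.

```python
def projection_features(frame):
    """Row sums followed by column sums of a grayscale frame (list of rows)."""
    rows = [sum(r) for r in frame]
    cols = [sum(c) for c in zip(*frame)]
    return rows + cols


def feature_distance(f1, f2):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(f1, f2))


def candidate_transitions(frames, threshold):
    """Indices i where the change from frames[i] to frames[i + 1] exceeds
    the threshold; only these would be passed on to a second stage."""
    feats = [projection_features(f) for f in frames]
    return [
        i
        for i in range(len(feats) - 1)
        if feature_distance(feats[i], feats[i + 1]) > threshold
    ]


# Toy 2x2 frames: two dark frames followed by two bright frames,
# i.e. one abrupt cut between index 1 and index 2.
frames = [
    [[10, 10], [10, 10]],
    [[12, 10], [10, 11]],
    [[200, 200], [200, 200]],
    [[198, 200], [200, 201]],
]

candidates = candidate_transitions(frames, threshold=100)
print(candidates)
```

Because most consecutive frames within a shot differ only slightly, a cheap filter of this kind discards them early, which is the source of the speed-up the abstract describes.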


Lane and Road Marker Semantic Video Segmentation Using Mask Cropping and Optical Flow Estimation

Sensors ◽ 10.3390/s21217156 ◽ 2021 ◽ Vol 21 (21) ◽ pp. 7156

Author(s): Guansheng Xing ◽ Ziming Zhu

Keyword(s): Optical Flow ◽ Video Segmentation ◽ Learning Algorithm ◽ Time Consistency ◽ Autonomous Driving ◽ Target Area ◽ Temporal Consistency ◽ Current Frame ◽ The Past ◽ Single Output

Lane and road marker segmentation is crucial in autonomous driving, and many related methods have been proposed. However, most of them rely on single-frame prediction, which produces unstable results between frames, while some multi-frame semantic segmentation methods accumulate errors and are not fast enough. We therefore propose a deep learning algorithm that takes into account the continuity information of adjacent image frames. It comprises image sequence processing and an end-to-end trainable multi-input, single-output network that jointly segments lanes and road markers. To emphasize locations where the target is highly probable in adjacent frames and to refine the segmentation result of the current frame, we explicitly consider the time consistency between frames: we expand the segmentation region of the previous frame, use the optical flow of the adjacent frames to reverse the past prediction, and then feed it to the network as an additional input during training and inference, thereby improving the network's attention to the target area of the past frame. We segmented lanes and road markers on the Baidu Apolloscape lanemark segmentation dataset and the CULane dataset, and present benchmarks for different networks. The experimental results show that this method accelerates the segmentation of video lanes and road markers by 2.5 times, increases accuracy by 1.4%, and reduces temporal consistency by only 2.2% at most.
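The idea of expanding the previous frame's segmentation and aligning it to the current frame with optical flow can be sketched as below. This is a simplified illustration, not the authors' network input pipeline: it assumes integer-valued flow, nearest-pixel sampling, a one-pixel 4-neighbourhood dilation, and a flow convention in which `flow[y][x] = (dy, dx)` points from a current-frame pixel to its source in the previous frame. All names are hypothetical.

```python
def dilate(mask):
    """Expand a binary mask by one pixel in the 4-neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out


def warp_mask(prev_mask, flow):
    """Backward-warp the previous mask into the current frame.

    flow[y][x] = (dy, dx) is assumed to point from the current pixel
    to the pixel it came from in the previous frame.
    """
    h, w = len(prev_mask), len(prev_mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = prev_mask[sy][sx]
    return out


# Previous-frame prediction: a single marked pixel.
prev_mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# Uniform flow: every current pixel came from one column to the left,
# i.e. the scene content moved one pixel to the right.
flow = [[(0, -1)] * 4 for _ in range(3)]

# Expand the previous prediction, then align it with the current frame.
prior = warp_mask(dilate(prev_mask), flow)
for row in prior:
    print(row)
```

The resulting `prior` mask is the kind of extra input channel that could accompany the current frame; how the paper's network actually consumes it is not specified in the abstract.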


Vehicle Re-Identification and Tracking Based on Video Segmentation

10.1145/3487075.3487185 ◽ 2021

Author(s): Liangru Xiang ◽ Zhijia Yu ◽ Jianming Hu ◽ Yi Zhang

Keyword(s): Video Segmentation


An Automatic Classification Method of Sports Teaching Video Using Support Vector Machine

Scientific Programming ◽ 10.1155/2021/4728584 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-8

Author(s): Zhang Min-qing ◽ Li Wen-ping

Keyword(s): Support Vector Machine ◽ Video Segmentation ◽ Prediction Method ◽ Classification Performance ◽ Video Data ◽ Support Vector ◽ Classification Algorithms ◽ Video Classification ◽ Video Feature ◽ Different Types

There are many different types of sports training videos, and categorizing them can be difficult. This research therefore introduces an automatic video content classification system that makes large amounts of video data easier to manage. It presents a video feature extraction approach that uses a support vector machine (SVM) video classification algorithm together with a combination of video and audio dual-mode features. The system automates the classification of cartoons, ads, music, news, and sports videos, as well as the detection of terrorist and violent moments in films. First, a new feature expression scheme, the MPEG-7 visual descriptor subcombination, is proposed based on an analysis of the existing video classification algorithms, with the goal of addressing the problems in these algorithms. This is accomplished by analyzing the visual differences of the five video classification algorithms. The model extracts 9 descriptors from the four characteristics of color, texture, shape, and motion, yielding a new overall visual feature with good results. The results suggest that the algorithm optimizes video segmentation by highlighting disparities in feature selection between different categories of films. Second, the multivideo classification performance of the support vector machine is improved by an enhanced secondary prediction method. Finally, a comparison experiment with related existing algorithms was conducted. The proposed method outperformed the compared algorithms in classification accuracy on five different types of videos, as well as in the recognition of terrorist and violent incidents.
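To make the combination of dual-mode features and an SVM concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the regularized hinge loss, applied to concatenated visual and audio feature vectors. It is a generic textbook technique, not the paper's algorithm: it is binary only, it does not reproduce the MPEG-7 descriptors or the secondary prediction method, and the feature values, labels, and hyperparameters are all hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse(visual, audio):
    """Concatenate visual and audio features into one vector."""
    return visual + audio


def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=50):
    """Subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. Returns the weight vector and bias.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * (dot(w, x) + b) < 1
            # Shrink the weights (gradient of the regularization term).
            w = [wi * (1 - lr * lam) for wi in w]
            if violated:
                # Hinge-loss subgradient step for a margin violation.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1


# Toy data: one visual and one audio feature per clip (made-up values).
# +1 stands for "sports", -1 for "news".
samples = [
    fuse([2.0], [2.0]), fuse([3.0], [2.5]), fuse([2.5], [3.0]),
    fuse([-2.0], [-2.0]), fuse([-3.0], [-2.5]), fuse([-2.5], [-3.0]),
]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
print(predict(w, b, fuse([2.2], [2.8])))
print(predict(w, b, fuse([-2.4], [-2.1])))
```

A five-category setting like the one in the abstract would need a multiclass extension, for example one classifier per category, which this sketch leaves out.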
