Multi-object Recognition Method Based on Improved YOLOv2 Model

2021 ◽  
Vol 50 (1) ◽  
pp. 13-27
Author(s):  
Binbin Shi ◽  
Xun Li ◽  
Tingting Nie ◽  
Kaibin Zhang ◽  
Wenjie Wang

A vehicle multi-object identification and classification method based on the YOLOv2 algorithm is proposed to address the low detection rate, poor robustness, and unsatisfactory classification performance of classical multi-object detection and vehicle type classification in real road environments. Building on the YOLOv2 algorithm, the network structure of YOLOv2-voc is improved according to actual road conditions. The classification training model is obtained from ImageNet data with fine-tuning, guided by analysis of the training results and vehicle object characteristics. The resulting improved vehicle identification and classification network is called YOLOv2-voc_mul. To verify the validity of the detection method, experiments are performed on samples with simple and complex backgrounds, and the results are compared with the existing YOLOv2, YOLOv2-voc, and YOLOv3 models, each after 70000 iterations. The results show that the proposed YOLOv2-voc_mul model has an accuracy of 98.6% under the simple background, and the mAP (mean Average Precision) of different models reaches 87.81%. Under the complex background, the improved YOLOv2-voc_mul model has average accuracies of 92.09% and 89.64% for single- and multi-object detection of four different models, respectively. In summary, the proposed method has better accuracy, a low false detection rate, and good robustness.
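
The abstract reports mAP without saying which variant of average precision was used. As a rough illustration only, the sketch below computes a non-interpolated AP per class from ranked detections and averages it over classes. The function names, the `(confidence, is_true_positive)` input format, and the class labels in the usage note are assumptions made for this example, not details from the paper.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, whether it matched a ground-truth box).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """Non-interpolated AP for a single class (an assumed variant)."""
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Accumulate precision at each rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """mAP: unweighted mean of the per-class AP values."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

For example, `mean_average_precision({"car": ([(0.9, True), (0.8, False), (0.7, True)], 2), "bus": ([(0.9, True)], 1)})` averages the AP of the two hypothetical classes. Benchmarks such as PASCAL VOC use interpolated variants, so values from this sketch are not directly comparable to the reported figures.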

2017 ◽  
Vol 2 (1) ◽  
pp. 80-87
Author(s):  
Puyda V. ◽  
Stoian A.

Detecting objects in a video stream is a typical problem in modern computer vision systems, which are used in many areas. Object detection can be performed both on static images and on frames of a video stream. Essentially, object detection means finding color and intensity non-uniformities that can be treated as physical objects. In addition, the coordinates, size, and other characteristics of these non-uniformities can be determined and then used to solve other computer vision problems such as object identification. In this paper, we study three algorithms that can be used to detect objects of different nature and that are based on different approaches: detection of color non-uniformities, frame difference, and feature detection. As input data, we use a video stream obtained from a video camera or from an mp4 video file. Simulations and testing of the algorithms were done on a universal computer based on open-source hardware, built on the Broadcom BCM2711 quad-core Cortex-A72 (ARM v8) 64-bit SoC processor with a frequency of 1.5 GHz. The software was created in Visual Studio 2019 using OpenCV 4 on Windows 10, and on the universal computer running Linux (Raspbian Buster OS) for the open-source hardware. In the paper, the methods under consideration are compared. The results of the paper can be used in the research and development of modern computer vision systems for different purposes. Keywords: object detection, feature points, keypoints, ORB detector, computer vision, motion detection, HSV color model
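
Of the three approaches named in the abstract, frame difference is the simplest to illustrate. The sketch below is a dependency-free approximation and not the authors' OpenCV implementation: it thresholds the absolute intensity difference between two grayscale frames and reports the bounding box of the changed pixels. The `Frame` type, the function names, and the default threshold of 25 are assumptions chosen for this example.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def frame_difference_mask(previous: Frame, current: Frame, threshold: int = 25) -> List[List[bool]]:
    """Mark pixels whose intensity changed by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) enclosing all changed pixels, or None."""
    coords = [(x, y) for y, row in enumerate(mask) for x, changed in enumerate(row) if changed]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
```

A single box around all changed pixels is a deliberate simplification. Separating several moving objects would additionally require grouping the mask into connected regions, which this sketch does not attempt.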


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 72528-72537 ◽  
Author(s):  
Hatim Derrouz ◽  
Abderrahim Elbouziady ◽  
Hamd Ait Abdelali ◽  
Rachid Oulad Haj Thami ◽  
Sanaa El Fkihi ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1906
Author(s):  
Jia-Zheng Jian ◽  
Tzong-Rong Ger ◽  
Han-Hua Lai ◽  
Chi-Ming Ku ◽  
Chiung-An Chen ◽  
...  

Diverse computer-aided diagnosis systems based on convolutional neural networks have been applied to automate the detection of myocardial infarction (MI) in electrocardiograms (ECG) for early diagnosis and prevention. However, issues such as overfitting and underfitting have not been taken into account. In other words, it is unclear whether the network structure is too simple or too complex. To address this, the proposed models were developed by starting with the simplest structure: a multi-lead features-concatenate narrow network (N-Net) in which each lead branch includes only two convolutional layers. Additionally, multi-scale features-concatenate networks (MSN-Net) were implemented, in which larger features are extracted by pooling the signals. The best structure was obtained by tuning both the number of filters in the convolutional layers and the number of input signal scales. As a result, the N-Net reached an accuracy of 95.76% in the MI detection task, whereas the MSN-Net reached an accuracy of 61.82% in the MI locating task. Both networks give a higher average accuracy than the state of the art, with a significant difference of p < 0.001 as evaluated by the U test. The models are also smaller in size and are thus suitable for wearable devices for offline monitoring. In conclusion, testing across both simple and complex network structures is indispensable. However, the handling of the class imbalance problem and the quality of the extracted features are yet to be discussed.


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Xun Li ◽  
Yao Liu ◽  
Zhengfan Zhao ◽  
Yue Zhang ◽  
Li He

Vehicle detection is expected to be robust and efficient in various scenes. We propose a multivehicle detection method based on YOLO under the Darknet framework. We also improve the YOLO-voc structure according to changes in the target scene and traffic flow. The classification training model is obtained based on ImageNet, and the parameters are fine-tuned according to the training results and the vehicle characteristics. Finally, we obtain an effective YOLO-vocRV network for road vehicle detection. To verify the performance of our method, experiments are carried out on different vehicle flow states and compared with the classical YOLO-voc, YOLO 9000, and YOLO v3. The experimental results show that our method achieves detection rates of 98.6% in the free flow state, 97.8% in the synchronous flow state, and 96.3% in the blocking flow state. In addition, our proposed method has a lower false detection rate than previous works and shows good robustness.


2021 ◽  
Vol 2083 (4) ◽  
pp. 042017
Author(s):  
Yingdong Ru

Music symbol recognition is an important part of Optical Music Recognition (OMR), and chord recognition is one of the most important research topics in the field of music information retrieval. It plays an important role in information processing, music structure analysis, and recommendation systems. To address the low chord recognition accuracy of OMR models, the article proposes a chord recognition method based on the YOLOV4 neural network model. First, the YOLOV4 network model is trained on single-voice scores to obtain the best training model. Then, scores containing chords are trained through neural network fine-tuning. The model was tested on a test set generated by MuseScore. The experimental results show that the method recognizes chords well and that the accuracy of note recognition is high: the accuracy of the duration value reaches 0.96, which is higher than the note recognition accuracy of other score recognition models.


Author(s):  
Garima Devnani ◽  
Ayush Jaiswal ◽  
Roshni John ◽  
Rajat Chaurasia ◽  
Neha Tirpude

Fine-tuning of a model is most often required to cater to users' explicit requirements. But the question remains whether the model is accurate enough to be used for a certain application. This paper presents the metrics used for performance evaluation of a Convolutional Neural Network (CNN) model. The evaluation is based on the training process, which provides intermediate models after every 1000 iterations. Because 1000 iterations are not substantial over the range of 490k iterations, the intermediate models are grouped into groups of 100k iterations each. The intention was to compare the recorded metrics to evaluate the model in terms of accuracy. The model was trained on a set of specific categories chosen from the Microsoft Common Objects in Context (MS COCO) dataset, while allowing users to test the model's accuracy on their own externally available images. Our trained model ensured that all objects present in the image are detected, to depict the effect of precision.
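
The grouping of intermediate models described above can be sketched as follows. The abstract does not state the exact group boundaries or how metrics are aggregated within a group, so this example assumes that each group covers iterations `(k * 100000, (k + 1) * 100000]` and that the metric is averaged. The function name and input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_mean_metric(checkpoints: List[Tuple[int, float]], group_size: int = 100_000) -> Dict[int, float]:
    """Average a metric over checkpoints grouped by iteration range.

    `checkpoints` holds (iteration, metric) pairs, one per intermediate model.
    Key k covers iterations in (k * group_size, (k + 1) * group_size].
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for iteration, metric in checkpoints:
        # Subtracting 1 places iteration 100000 in group 0 rather than group 1.
        buckets[(iteration - 1) // group_size].append(metric)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}
```

With checkpoints saved every 1000 iterations up to 490k, this yields four full groups and a final partial group covering 400k to 490k.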

