Random Forest with Adaptive Local Template for Pedestrian Detection

2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Tao Xiang ◽  
Tao Li ◽  
Mao Ye ◽  
Zijian Liu

Pedestrian detection under large intraclass variation is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a set of local templates with different sizes and locations from positive exemplars. Then, a Random Forest is built whose splitting functions are optimized by maximizing the class purity obtained when matching the local templates to the training samples. To improve classification accuracy, we adopt a boosting-like algorithm that updates the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are splitting functions based on local template matching with adaptive size and location, and an iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.
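The splitting test can be pictured as a template match followed by a threshold. The sketch below is our own simplification, not the authors' code: each candidate split is a (template, location, threshold) triple, and candidates are scored by the weighted class purity they induce on the training samples.

```python
# Minimal sketch of a template-matching split test for a Random Forest node.
# All names and the SSD matching criterion are our assumptions for illustration.
import numpy as np

def match_score(sample, template, top, left):
    """Sum of squared differences between the template and the local patch of a
    sample at the template's location (lower means a better match)."""
    h, w = template.shape
    patch = sample[top:top + h, left:left + w]
    return float(np.sum((patch - template) ** 2))

def weighted_gini(labels, weights):
    """Gini impurity of a binary label set under sample weights."""
    total = weights.sum()
    if total == 0:
        return 0.0
    p = np.sum(weights * labels) / total      # weighted fraction of positives
    return 2.0 * p * (1.0 - p)

def split_impurity(samples, labels, weights, template, top, left, threshold):
    """Weighted impurity induced by routing samples with the template-match test;
    the best (template, location, threshold) triple minimises this value."""
    scores = np.array([match_score(s, template, top, left) for s in samples])
    go_left = scores <= threshold
    w_l, w_r = weights[go_left].sum(), weights[~go_left].sum()
    return (w_l * weighted_gini(labels[go_left], weights[go_left]) +
            w_r * weighted_gini(labels[~go_left], weights[~go_left])) / (w_l + w_r)
```

In a boosting-like layer-wise scheme, the sample weights passed to this scoring function would be increased for samples the previous tree layer misclassified.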

2012 ◽  
Vol 542-543 ◽  
pp. 937-940
Author(s):  
Ping Shu Ge ◽  
Guo Kai Xu ◽  
Xiu Chun Zhao ◽  
Peng Song ◽  
Lie Guo

To locate pedestrians faster and more accurately, a pedestrian detection method based on histograms of oriented gradients (HOG) computed in a region of interest (ROI) is introduced. Features are extracted only in the ROI where the pedestrian's legs may appear, which reduces the dimension of the feature vector and simplifies the computation. The vertical edge symmetry of the pedestrian's legs is then fused to confirm the detection. Experimental results indicate that this method achieves good accuracy with a lower processing time than the traditional method.
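As an illustration of this pipeline, the sketch below (our own assumptions about a 64x128 window layout, not the paper's implementation) computes HOG only over the lower half of a candidate window and adds a simple left/right symmetry measure of vertical edges to confirm the detection.

```python
# Hedged sketch: leg-region HOG plus vertical-edge symmetry on a 64x128 window.
import cv2
import numpy as np

def leg_roi_hog(window_gray):
    """HOG feature of the lower half (legs region) of a 64x128 uint8 window."""
    roi = window_gray[64:128, :]                        # bottom 64x64 block
    hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)
    return hog.compute(roi).ravel()

def vertical_edge_symmetry(window_gray):
    """Symmetry of vertical-edge energy about the window's central axis
    (values near 1.0 mean nearly symmetric, as expected for a pair of legs)."""
    gx = np.abs(cv2.Sobel(window_gray, cv2.CV_32F, 1, 0, ksize=3))
    cols = gx[64:128, :].sum(axis=0)                    # per-column edge energy, legs only
    left, right = cols[:32], cols[32:][::-1]
    denom = left + right + 1e-6
    return float(np.mean(1.0 - np.abs(left - right) / denom))
```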


2011 ◽  
Vol 346 ◽  
pp. 731-737 ◽  
Author(s):  
Jin Feng Yang ◽  
Man Hua Liu ◽  
Hui Zhao ◽  
Wei Tao

This paper presents an efficient method to detect fasteners based on image processing and optical detection technologies. The direction field of the fastener image is computed as the feature descriptor for template matching. This fastener detection method can be used to determine the status of a fastener on the corresponding track, i.e., whether the fastener is present or missing. Experimental results show that the proposed method is computationally efficient and robust for fastener detection in complex environments.
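A direction field is typically a per-block dominant gradient orientation. The sketch below, written under our own assumptions rather than from the paper, computes such a field and compares it with a stored template field by mean angular difference.

```python
# Hedged sketch: block-wise direction field and a simple field distance.
import cv2
import numpy as np

def direction_field(gray, block=8):
    """Dominant gradient orientation (radians) for each block x block cell."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    h, w = gray.shape
    field = np.zeros((h // block, w // block), dtype=np.float32)
    for i in range(field.shape[0]):
        for j in range(field.shape[1]):
            bx = gx[i*block:(i+1)*block, j*block:(j+1)*block]
            by = gy[i*block:(i+1)*block, j*block:(j+1)*block]
            # doubled-angle averaging handles the 180-degree ambiguity of edges
            field[i, j] = 0.5 * np.arctan2(np.sum(2 * bx * by),
                                           np.sum(bx ** 2 - by ** 2))
    return field

def field_distance(field_a, field_b):
    """Mean angular difference between two direction fields (smaller = closer)."""
    d = np.abs(field_a - field_b)
    d = np.minimum(d, np.pi - d)      # orientations wrap at 180 degrees
    return float(d.mean())
```

Matching the field of a candidate region against a "fastener present" template and thresholding the distance would then flag missing fasteners.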


Author(s):  
So-Hyun Park ◽  
Sun-Young Ihm ◽  
Aziz Nasridinov ◽  
Young-Ho Park

This study proposes a method to reduce the playing-related musculoskeletal disorders (PRMDs) that often occur among pianists. Specifically, we propose a feasibility test that evaluates several state-of-the-art deep learning algorithms for preventing pianists' injuries. To this end, we present (1) the C3P dataset, which contains various piano-playing postures, and (2) the application of four learning algorithms that have demonstrated their superiority in video classification to the proposed C3P dataset. To our knowledge, this is the first study that attempts to apply the deep learning paradigm to reducing PRMDs in pianists. The experimental results show a classification accuracy of 80% on average, supporting the hypothesis that deep learning algorithms are effective for preventing pianists' injuries.


Author(s):  
Ningyu Zhang ◽  
Xiang Chen ◽  
Xin Xie ◽  
Shumin Deng ◽  
Chuanqi Tan ◽  
...  

Document-level relation extraction aims to extract relations among multiple entity pairs in a document. Previously proposed graph-based or transformer-based models treat entities independently, ignoring global information among relational triples. This paper approaches the problem by predicting an entity-level relation matrix that captures both local and global information, in parallel to the semantic segmentation task in computer vision. We propose a Document U-shaped Network for document-level relation extraction: an encoder module captures the context information of entities, and a U-shaped segmentation module over the image-style feature map captures the global interdependency among triples. Experimental results show that our approach obtains state-of-the-art performance on three benchmark datasets: DocRED, CDR, and GDA.
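To make the "image-style" view concrete, here is a toy sketch of the idea (our own illustration, not the released DocuNet code; layer sizes are arbitrary): entity embeddings are tiled into an N x N pair map, and a small U-shaped convolutional module predicts a relation logit for every entity pair.

```python
# Hedged sketch: entity-pair map + tiny U-shaped segmentation head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUSegmenter(nn.Module):
    """Toy U-shaped module over an entity-pair feature map."""
    def __init__(self, in_ch, num_relations, hidden=64):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(in_ch, hidden, 3, padding=1), nn.ReLU(),
                                  nn.MaxPool2d(2))
        self.up = nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(hidden + in_ch, num_relations, 1)

    def forward(self, pair_map):                          # (B, C, N, N)
        x = self.down(pair_map)                           # contracting path
        x = F.interpolate(x, size=pair_map.shape[-2:],    # expand back to N x N
                          mode="bilinear", align_corners=False)
        x = self.up(x)
        x = torch.cat([x, pair_map], dim=1)               # U-style skip from the input
        return self.head(x)                               # (B, R, N, N) relation logits

def entity_pair_map(entity_emb):                          # (N, D) entity embeddings
    """(1, 2D, N, N) map whose cell (i, j) concatenates embeddings of entities i, j."""
    n, d = entity_emb.shape
    a = entity_emb.unsqueeze(1).expand(n, n, d)
    b = entity_emb.unsqueeze(0).expand(n, n, d)
    return torch.cat([a, b], dim=-1).permute(2, 0, 1).unsqueeze(0)
```

Usage would look like `logits = TinyUSegmenter(2 * d, num_relations)(entity_pair_map(emb))`, with cell (i, j) scored for each relation type.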


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Zhongmin Liu ◽  
Zhicai Chen ◽  
Zhanming Li ◽  
Wenjin Hu

In recent years, techniques based on deep detection models have achieved overwhelming improvements in detection accuracy, which makes them well suited to applications such as pedestrian detection. However, speed and accuracy are a long-standing pair of contradictions, and achieving a good trade-off between them is a problem that must be considered when designing a detector. To this end, we apply the general-purpose detector YOLOv2, a state-of-the-art method for general detection tasks, to pedestrian detection. We then modify the network parameters and structure according to the characteristics of pedestrians, making the method more suitable for detecting them. Experimental results on the INRIA pedestrian detection dataset show that it reaches a fairly high detection speed with only a small precision gap compared with state-of-the-art pedestrian detection methods. Furthermore, we add weak semantic segmentation networks after the shared convolution layers to highlight pedestrians, and we employ a scale-aware structure to handle the wide range of pedestrian sizes in the Caltech pedestrian detection dataset, which yields further gains over the original improvement.
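YOLOv2 selects its anchor shapes by clustering training boxes, so one typical pedestrian-specific adaptation (our assumption, not necessarily the authors' exact modification) is to re-run that dimension-cluster step on pedestrian boxes only, which yields tall, narrow anchors.

```python
# Hedged sketch: k-means with 1 - IoU distance over (w, h) boxes, as in YOLOv2's
# anchor selection, applied to pedestrian boxes only.
import numpy as np

def iou_wh(box, anchors):
    """IoU between one (w, h) box and each anchor, all centred at the origin."""
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def anchor_kmeans(boxes_wh, k=5, iters=100, seed=0):
    """Cluster (w, h) pairs; returns k pedestrian-specific anchor shapes."""
    rng = np.random.default_rng(seed)
    anchors = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        assign = np.array([np.argmax(iou_wh(b, anchors)) for b in boxes_wh])
        new = np.array([boxes_wh[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors
```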


Author(s):  
Qunsheng Ruan ◽  
Qingfeng Wu ◽  
Junfeng Yao ◽  
Yingdong Wang ◽  
Hsien-Wei Tseng ◽  
...  

In the intelligent processing of tongue images, one of the most important tasks is to accurately segment the tongue body from the whole image, and good tongue-edge processing is of great significance for subsequent tongue feature extraction. To improve segmentation performance on tongue images, we propose an efficient tongue segmentation model based on U-Net. Three improvements are introduced: optimizing the model's main network, designing a new network dedicated to tongue-edge segmentation, and proposing a weighted binary cross-entropy loss function. Optimizing the main segmentation network helps the model distinguish the foreground and background features of the tongue image as well as possible. The novel tongue-edge segmentation network focuses on the tongue edge because it carries a great deal of important information. The proposed loss function strengthens the pixel-level supervision for tongue images. Moreover, because tongue image resources in Traditional Chinese Medicine (TCM) are scarce, special measures are adopted to augment the training samples. Comparative experiments on two datasets were conducted to verify the performance of the segmentation model. The experimental results indicate that the loss of our model converges faster than that of the others, and that our model segments tongue images from poor environments with better stability and robustness. The results also show that our model outperforms state-of-the-art models on the two most important tongue image segmentation indexes, IoU and Dice. Moreover, experiments on augmented samples demonstrate that our model achieves better performance.
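One way to realize a weighted binary cross-entropy that emphasizes the tongue edge is sketched below. This is our own assumption about the weighting scheme, not the paper's exact loss: pixels in a thin band around the mask boundary receive a larger weight.

```python
# Hedged sketch: edge-weighted binary cross-entropy for tongue segmentation.
import torch
import torch.nn.functional as F

def edge_weighted_bce(logits, target, edge_weight=5.0, kernel=5):
    """logits, target: (B, 1, H, W); target is a binary tongue mask in {0, 1}."""
    # Approximate the edge band as dilated mask minus eroded mask
    # (max-pool dilates; max-pool of the inverse erodes).
    pad = kernel // 2
    dilated = F.max_pool2d(target, kernel, stride=1, padding=pad)
    eroded = 1.0 - F.max_pool2d(1.0 - target, kernel, stride=1, padding=pad)
    edge = (dilated - eroded).clamp(0, 1)
    weights = 1.0 + edge_weight * edge            # heavier loss on boundary pixels
    return F.binary_cross_entropy_with_logits(logits, target, weight=weights)
```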


2016 ◽  
pp. 164-169
Author(s):  
D. Voloshyn ◽  

The authors describe an application for solving the video context detection problem. The application architecture uses the state-of-the-art deep learning framework TensorFlow together with the computer vision library OpenCV in an isolated agent environment. Experimental results are presented to demonstrate the effectiveness of the developed product.
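A minimal sketch of how such a pipeline could be wired together (our assumptions about the architecture, not the authors' code): OpenCV samples frames from a video stream and a pretrained TensorFlow/Keras classifier labels them to provide a context signal.

```python
# Hedged sketch: OpenCV frame sampling feeding a stand-in TensorFlow classifier.
import cv2
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")   # illustrative model

def classify_stream(source=0, every_n=30):
    """Return one coarse label per sampled frame of the video source."""
    cap = cv2.VideoCapture(source)
    labels, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:                                   # sample 1 frame in N
            rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)), cv2.COLOR_BGR2RGB)
            x = tf.keras.applications.mobilenet_v2.preprocess_input(
                rgb.astype(np.float32)[None])
            preds = model.predict(x, verbose=0)
            top = tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=1)
            labels.append(top[0][0][1])
        idx += 1
    cap.release()
    return labels
```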


Author(s):  
Chihuang Liu ◽  
Joseph JaJa

Adversarial training has been successfully applied to build robust models, but at a certain cost: as the robustness of a model increases, its standard classification accuracy declines. This phenomenon has been suggested to be an inherent trade-off. We propose a model that employs feature prioritization through a nonlinear attention module and L2 feature regularization to improve both adversarial robustness and standard accuracy relative to adversarial training. The attention module encourages the model to rely heavily on robust features by assigning larger weights to them while suppressing non-robust features. The regularizer encourages the model to extract similar features for natural and adversarial images, effectively ignoring the added perturbation. In addition to evaluating the robustness of our model, we provide justification for the attention module and propose a novel experimental strategy that quantitatively demonstrates that our model is almost ideally aligned with salient data characteristics. Additional experimental results illustrate the power of our model relative to state-of-the-art methods.
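The objective described above can be sketched as follows (names and the weighting factor are our assumptions): a cross-entropy term on adversarial examples plus an L2 penalty that pulls the features of the natural and adversarial images together.

```python
# Hedged sketch: adversarial cross-entropy plus L2 feature regularization.
import torch
import torch.nn.functional as F

def robust_loss(model, x_nat, x_adv, y, lam=0.1):
    """model(x) is assumed to return (features, logits)."""
    feat_nat, _ = model(x_nat)
    feat_adv, logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_adv, y)                        # adversarial training term
    feat_reg = ((feat_adv - feat_nat) ** 2).sum(dim=1).mean()  # L2 feature regularizer
    return ce + lam * feat_reg
```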


2021 ◽  
Vol 12 (04) ◽  
pp. 23-32
Author(s):  
Yuan He ◽  
Han-Dong Zhang ◽  
Xin-Yue Huang ◽  
Francis Eng Hock Tay

In fabric production, defect detection plays an important role in product quality control. Because traditional manual fabric defect inspection is time-consuming and inaccurate, using computer vision technology to detect fabric defects automatically better fulfills manufacturing requirements. In this project, we improve Faster R-CNN with a convolutional block attention module (CBAM) to detect fabric defects. The attention module infers an attention map from the intermediate feature map and multiplies it with that feature map to adaptively refine the features. This improves classification and detection performance without increasing the computational cost. The experimental results show that Faster R-CNN with the attention module effectively improves classification accuracy.
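For reference, a compact CBAM-style block follows the published CBAM design: channel attention from average- and max-pooled descriptors, then spatial attention from channel-pooled maps. The hyperparameters in this sketch are illustrative, and how it is inserted into the Faster R-CNN backbone is not specified by the abstract.

```python
# Sketch of a CBAM-style block (channel attention followed by spatial attention).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                 nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):                                    # (B, C, H, W)
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                   # channel attention, avg pool
        mx = self.mlp(x.amax(dim=(2, 3)))                    # channel attention, max pool
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),          # spatial attention input
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))            # refined feature map
```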


2020 ◽  
Author(s):  
Xingyi Yang ◽  
Yonghu Wang ◽  
Robert Laganiere

Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves both classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) is one of the state-of-the-art CNN-based object detectors, leveraging it for real-time pedestrian detection remains very challenging. In this paper, we propose SA YOLOv3, a scale-aware You Only Look Once framework that improves YOLOv3's detection of small-scale pedestrian instances in real time. Our network introduces two sub-networks that detect pedestrians of different scales; their outputs are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.
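The fusion of the two scale branches can be pictured as pooling their boxes and applying standard non-maximum suppression; the sketch below reflects our assumptions about that step, not the authors' exact merging rule.

```python
# Hedged sketch: merge detections from two scale-specific sub-networks with NMS.
import numpy as np

def nms(boxes, scores, iou_thr=0.45):
    """boxes: (N, 4) as [x1, y1, x2, y2]; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou < iou_thr]
    return keep

def fuse_scale_branches(boxes_small, scores_small, boxes_large, scores_large):
    """Pool the boxes from both sub-networks and suppress duplicates."""
    boxes = np.concatenate([boxes_small, boxes_large], axis=0)
    scores = np.concatenate([scores_small, scores_large], axis=0)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```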

