International Journal of Pattern Recognition and Artificial Intelligence
Latest Publications





Published By World Scientific

ISSN 0218-0014

Jielu Yan, MingLiang Zhou, Jinli Pan, Meng Yin, Bin Fang

3D human pose estimation is the task of estimating the 3D articulated structure of a person from an image or a video. The technology has great potential because it enables tracking people and analyzing motion in real time. Much recent research has sought to improve human pose estimation, but few works have reviewed 3D human pose estimation. In this paper, we offer a comprehensive survey of state-of-the-art methods for 3D human pose estimation, covering pose estimation solutions, implementations on images or videos that contain different numbers of people, and advanced 3D human pose estimation techniques. The algorithms are further subdivided into sub-categories and compared by methodology. To the best of our knowledge, this is the first such comprehensive survey of recent progress in 3D human pose estimation, and we hope it will facilitate the completion, refinement and application of 3D human pose estimation.

Jiang Chang, Shengqi Guan

To address dataset expansion in deep learning tasks such as image classification, this paper proposes an image generation model called Class Highlight Generative Adversarial Networks (CH-GANs). To highlight image categories, accelerate model convergence and generate realistic images with clear categories, the image category labels are first deconvolved and integrated into the generator through [Formula: see text] convolution. Second, a novel discriminator is designed that judges not only the authenticity of the image but also its category. Finally, to classify strip steel defects quickly and accurately, the lightweight image classification network GhostNet is improved by modifying the number of network layers and channels, adding SE modules, etc., and is trained on the dataset expanded by CH-GAN. In the comparative experiments, the average FID of CH-GAN is 7.59, and the improved GhostNet reaches an accuracy of 95.67% with 0.19[Formula: see text]M parameters. The experimental results demonstrate the effectiveness and superiority of the proposed methods in generating and classifying strip steel defect images.

Yong Yang, Young Chun Ko

With the rapid development of online e-commerce, traditional collaborative filtering algorithms, which suffer from data set reduction and sparse matrix filling, cannot meet the requirements of users. Taking handicrafts as an example, this paper proposes the design and application of a handicraft recommendation system based on an improved hybrid algorithm. Based on e-commerce system theory and the traditional user-based collaborative filtering algorithm, a personalized e-commerce system using the hybrid algorithm is designed and analyzed. The component model of the business recommendation system and the specific steps of the improved hybrid algorithm based on user information are given. Finally, an experimental analysis of the improved hybrid algorithm is carried out. The results show that the algorithm can effectively improve the effectiveness and exemption of recommending handicrafts. Moreover, it can reduce the user-item ratings of the candidate set and improve the accuracy of the predicted recommendations.
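
The abstract does not spell out the improved hybrid algorithm, so the following is only a minimal sketch of the traditional user-based collaborative filtering it builds on. The function names, the cosine similarity over co-rated items, and the neighbourhood size `k` are illustrative assumptions, not the authors' method.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts, over co-rated items only."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings of `item`.

    `ratings` maps user -> {item: rating}. Returns None when no neighbour
    with positive similarity has rated the item (hypothetical convention).
    """
    neighbours = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = [(sim, value) for sim, value in neighbours[:k] if sim > 0]
    if not top:
        return None
    return sum(sim * value for sim, value in top) / sum(sim for sim, _ in top)


# Toy handicraft ratings, invented purely for illustration.
toy_ratings = {
    "ann": {"vase": 5, "basket": 3, "quilt": 4},
    "bob": {"vase": 4, "basket": 2, "quilt": 5, "bowl": 4},
    "cara": {"vase": 1, "basket": 5, "bowl": 2},
}
```

Calling `predict_rating(toy_ratings, "ann", "bowl")` blends Bob's and Cara's ratings of the bowl, weighting Bob more heavily because his past ratings are closer to Ann's. A hybrid system would typically combine such a score with other signals, such as the user information the abstract mentions.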

Fukui Li, Jingyuan He, Mingliang Zhou, Bin Fang

Local search algorithms are widely applied to solve large-scale distributed constraint optimization problems (DCOPs). The distributed stochastic algorithm (DSA) is a typical local search algorithm for DCOPs. However, DSA has drawbacks, including a tendency to fall into local optima and unfairness in assignment choice. This paper presents a novel local search algorithm named VLSs to address these issues. In VLSs, each agent samples values according to the probability corresponding to each assignment, which enables it to choose other promising values. In addition, each agent periodically alternates between a greedy choice among multiple parallel solutions, which reduces the chance of falling into local optima, and a variance adjustment mechanism, which guides the search toward a relatively good initial solution. We prove the rationality of the variance adjustment mechanism and give a theoretical explanation of the impact of greed among multiple parallel solutions. The experimental results show the superiority of VLSs over state-of-the-art DCOP algorithms.
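
For readers unfamiliar with the baseline, the sketch below shows a plain synchronous DSA on a toy graph-colouring DCOP, where the cost is the number of edges whose endpoints share a value. It is not VLSs: the sampling, parallel-solution and variance-adjustment mechanisms are not reproduced. The activation probability `p`, the cost function and all names are assumptions chosen for illustration.

```python
import random


def local_cost(value, agent, assignment, neighbours):
    """Number of neighbours of `agent` that currently hold the same value."""
    return sum(1 for n in neighbours[agent] if assignment[n] == value)


def total_cost(assignment, neighbours):
    """Number of conflicting edges (each edge is seen from both ends, hence // 2)."""
    return sum(
        local_cost(assignment[a], a, assignment, neighbours) for a in neighbours
    ) // 2


def dsa(neighbours, domain, p=0.7, rounds=50, seed=0):
    """Plain DSA: an agent moves to its best value with probability p if that helps."""
    rng = random.Random(seed)
    assignment = {a: rng.choice(domain) for a in neighbours}
    for _ in range(rounds):
        snapshot = dict(assignment)  # all agents decide from the same round view
        for a in neighbours:
            current = local_cost(snapshot[a], a, snapshot, neighbours)
            best = min(domain, key=lambda v: local_cost(v, a, snapshot, neighbours))
            gain = current - local_cost(best, a, snapshot, neighbours)
            if gain > 0 and rng.random() < p:
                assignment[a] = best
    return assignment


# A 4-cycle with one chord; symmetric adjacency lists.
toy_graph = {
    "a": ["b", "d", "c"],
    "b": ["a", "c"],
    "c": ["b", "d", "a"],
    "d": ["c", "a"],
}
```

Because neighbouring agents can change values in the same round, the greedy step may trade one conflict for another, which is one way the local optima and oscillation mentioned in the abstract arise.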

Cheng Chi, Shasha Wu, Luyao Wang, Yaohua Wu

E-commerce retailers face the challenge of assembling a large number of time-critical picking orders. Common parts-to-picker autonomous intelligent warehouses, such as the automated vehicle storage and retrieval system and the robotic mobile fulfillment system, are often ill-suited to these requirements. A mixed-robotic fulfillment system is a hybrid robot picking system based on multi-device collaboration. It fuses the traditional automated vehicle storage and retrieval system with the robotic mobile fulfillment system. Considering the characteristics of the system and customer demand, this paper constructs a queuing network model to evaluate the performance of the system. Problems such as order service time, throughput capacity, and vehicle quantity configuration are analyzed experimentally. The validity of the model is verified by a simulation model.
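
The abstract does not give the structure of the queuing network, so the sketch below covers only a single M/M/c station, the standard building block of such models, using the Erlang C formula. Treating vehicles as the `servers`, and the parameter names themselves, are assumptions made for illustration.

```python
from math import factorial


def mmc_metrics(arrival_rate, service_rate, servers):
    """Steady-state metrics of an M/M/c queue (Erlang C).

    arrival_rate: mean order arrivals per unit time (lambda)
    service_rate: mean orders one server completes per unit time (mu)
    servers:      number of parallel servers, e.g. vehicles (c)
    """
    if arrival_rate <= 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers >= 1")
    offered_load = arrival_rate / service_rate
    utilisation = offered_load / servers
    if utilisation >= 1:
        raise ValueError("unstable queue: arrival rate exceeds total capacity")
    # Terms of the normalising constant for states 0..c-1 and for states >= c.
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    tail = offered_load ** servers / (factorial(servers) * (1 - utilisation))
    p_wait = tail / (head + tail)  # probability that an arriving order must queue
    mean_wait = p_wait / (servers * service_rate - arrival_rate)
    return {
        "utilisation": utilisation,
        "p_wait": p_wait,
        "mean_wait": mean_wait,
        "mean_time_in_system": mean_wait + 1 / service_rate,
    }
```

Sweeping `servers` for a fixed arrival rate gives a first, rough view of the trade-off between vehicle quantity and order service time; a full network model would additionally link several such stations.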

Harsh Khatter, Anil Ahlawat

Internet content increases exponentially day by day, so searches return growing amounts of irrelevant data. The vast amount of available web data therefore requires curation to improve the relevance of search results to the searched topics. The proposed F-CapsNet curates web blog content through a novel integration of fuzzy logic with a machine learning algorithm. The input content is first pre-processed, and seven major features are extracted: sentence position, bigrams, TF-IDF, cosine similarity, sentence length, proper noun score and numeric tokens. Fuzzy rules are then applied to generate the extractive summary. After the extractive curation, the output is passed to a novel capsule-network-based deep auto-encoder, which produces the abstractive summary. Performance measures such as precision, recall, F1-score, accuracy and specificity are computed, and the results are compared with existing state-of-the-art methods. The simulations indicate that the proposed method for content curation is more efficient than the compared methods.
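
Two of the seven features, TF-IDF and cosine similarity, can be illustrated with a small sketch. The whitespace tokenisation, the smoothed IDF variant and the function names are assumptions; the fuzzy rules and the capsule network are not reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """One sparse TF-IDF vector (dict term -> weight) per sentence.

    Each sentence is treated as a document; IDF uses the smoothed form
    log((1 + n) / (1 + df)) + 1, an assumption rather than the paper's choice.
    """
    tokenised = [s.lower().split() for s in sentences]
    n = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In an extractive pipeline, each sentence's similarity to the title or to the rest of the document could serve as one input score to the fuzzy rules, alongside the other features.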

Xuyang Han, Guimei Wang, Jiehui Liu, Lijie Yang, Pingge Zhang

Permanent-magnet direct-drive belt conveyors (PMDDBCs) run at high speed most of the time, which wastes a large amount of energy. To regulate the speed of a PMDDBC, it is necessary to clarify the relationship between the belt speed, the coal quantity on the conveyor and the total power of the system. Based on a BP neural network, this paper establishes a power consumption model of the PMDDBC that relates coal quantity, belt speed and total power. Furthermore, an improved hybrid algorithm (GACO) that combines the advantages of the genetic algorithm (GA) and ant colony optimization (ACO) is proposed to optimize the BP power consumption model, yielding the GACO–BP power consumption model. The original power consumption model is compared with the GACO–BP model through experiments. The results demonstrate that the GACO–BP power consumption model reduces various prediction errors, while the optimization ability, prediction accuracy and convergence speed are significantly enhanced. The model provides a reliable basis for speed regulation of the permanent-magnet direct-drive belt conveyor system and a theoretical reference for energy savings and consumption reduction in the coal industry.

Shuai Liu, Yuanning Liu, Xiaodong Zhu, Jing Liu, Guang Huo

In this paper, a two-stage multi-category recognition structure based on texture features is proposed. The method addresses the decline in recognition accuracy when training samples are lightweight. It also addresses the problem that an unsteady iris causes the recognition effect to vary within the same recognition structure. In the proposed structure, digitized values of the edge shape in the iris texture of the image are set as the texture trend feature, while the differences between the gray values of the image obtained by convolution are set as the grayscale difference feature. The texture trend feature is used in the first-stage recognition: template categories that do not match the tested iris are eliminated, and the remaining categories are treated as uncertain. In the second-stage recognition, the grayscale difference feature is applied to the uncertain categories to determine the iris recognition result. Experimental results on the JLU iris library show that the method is highly efficient for multi-category heterogeneous iris recognition with lightweight training samples and unsteady irises.

Reza Seifi Majdar, Hassan Ghassemian

Unlabeled samples and the transformation matrix are the two main parts of unsupervised and semi-supervised feature extraction (FE) algorithms. In this manuscript, a semi-supervised FE method, locality preserving projection in the probabilistic framework (LPPPF), is proposed to find a sufficient number of reliable and unmixed unlabeled samples from all classes and to construct an optimal projection matrix. The LPPPF has two main steps. In the first step, a number of reliable unlabeled samples are selected based on the training samples, spectral features, and spatial information in the probabilistic framework. To this end, the spectral and spatial probability distribution functions are calculated for each unlabeled sample. The spectral features and spatial information are then integrated into a joint probability distribution function. Finally, a sufficient number of unlabeled samples with the highest joint probability are selected. In the second step, the selected unlabeled samples are used to construct the transformation matrix based on their spectral and spatial information. The adjacency graph is improved by using new weights based on spectral and spatial information. The method is evaluated on three data sets, Indian Pines, Pavia University, and Kennedy Space Center (KSC), and compared with several recent and well-known supervised, semi-supervised, and unsupervised FE methods. Various experiments demonstrate the efficiency of the LPPPF in comparison with the other FE methods. The LPPPF also performs well with limited training samples.

Qunsheng Ruan, Qingfeng Wu, Junfeng Yao, Yingdong Wang, Hsien-Wei Tseng

In the intelligent processing of tongue images, one of the most important tasks is to accurately segment the tongue body from the whole image, and high-quality processing of the tongue body edge is of great significance for subsequent tongue feature extraction. To improve the performance of tongue image segmentation, we propose an efficient tongue segmentation model based on U-Net. The work has three parts: optimizing the model's main network, designing a new network that specifically handles tongue edge segmentation, and proposing a weighted binary cross-entropy loss function. The main segmentation network is optimized so that the model distinguishes the foreground and background features of the tongue image as well as possible. A novel tongue edge segmentation network focuses on the tongue edge, because the edge contains much important information. The proposed loss function is adopted to strengthen the pixel-level supervision of tongue images. Moreover, owing to a lack of tongue image resources in Traditional Chinese Medicine (TCM), special measures are adopted to augment the training samples. Comparative experiments on two datasets were conducted to verify the performance of the segmentation model. The experimental results indicate that the loss of our model converges faster than that of the others, and that our model segments tongue images captured in poor environments with better stability and robustness. The results also indicate that our model outperforms state-of-the-art models on the two most important tongue image segmentation metrics, IoU and Dice. Finally, experimental results on the augmented samples demonstrate that our model performs better.
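
The abstract names a weighted binary cross-entropy loss and the IoU and Dice metrics without giving formulas. The sketch below uses their common textbook forms on flattened pixel lists; the single foreground weight `pos_weight`, the clipping constant `eps` and the convention for empty masks are assumptions, not the authors' definitions.

```python
import math


def weighted_bce(probs, targets, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy with foreground pixels weighted by pos_weight.

    probs:   predicted foreground probabilities in [0, 1]
    targets: ground-truth labels, 1 for tongue pixels and 0 for background
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(probs)


def iou_and_dice(pred, target):
    """IoU and Dice for two binary masks given as flat sequences of 0/1."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    # Two empty masks are treated as a perfect match (assumed convention).
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / (pred_area + target_area) if union else 1.0
    return iou, dice
```

With `pos_weight` above 1, missed tongue pixels cost more than false alarms on the background, which is one common way to counter the imbalance between a small foreground region and a large background.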
