Long Short-Term Memory Projection Recurrent Neural Network Architectures for Piano’s Continuous Note Recognition

2017
Vol 2017
pp. 1-7
Author(s):
YuKang Jia
Zhicheng Wu
Yanyan Xu
Dengfeng Ke
Kaile Su

Long Short-Term Memory (LSTM) is a kind of Recurrent Neural Network (RNN) suited to time series, which has achieved good performance in speech recognition and image recognition. Long Short-Term Memory Projection (LSTMP) is a variant of LSTM that adds a projection layer to further improve the speed and performance of LSTM. As LSTM and LSTMP have performed well in pattern recognition, in this paper we combine them with Connectionist Temporal Classification (CTC) to study piano’s continuous note recognition for robotics. Based on the Beijing Forestry University music library, we conduct experiments to compare the recognition rates and numbers of iterations of a single-layer LSTM, a single-layer LSTMP, and Deep LSTM (DLSTM, LSTM with multiple layers). The single-layer LSTMP performs much better than the single-layer LSTM in both training time and recognition rate; that is, LSTMP has fewer parameters and therefore reduces the training time, and, benefiting from the projection layer, it also achieves better performance. The best recognition rate of LSTMP is 99.8%. As for DLSTM, the recognition rate can reach 100% because of the effectiveness of the deep structure, but compared with the single-layer LSTMP, DLSTM needs more training time.
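
The claim that LSTMP has fewer parameters than LSTM can be illustrated with a rough count. The sketch below assumes a layer without peephole connections and with one bias vector per gate; the function names and the layer sizes (40 inputs, 512 cells, 128 projected units) are hypothetical and are not taken from the paper.

```python
# Illustrative parameter counts for one LSTM layer versus one LSTMP layer.
# Assumptions: no peephole connections, one bias vector per gate; the sizes
# used below are hypothetical and not taken from the paper.


def lstm_param_count(n_in: int, n_cell: int) -> int:
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (n_cell * (n_in + n_cell) + n_cell)


def lstmp_param_count(n_in: int, n_cell: int, n_proj: int) -> int:
    """Same four gates, but the recurrent input is the projected output
    (n_proj values), plus one n_cell x n_proj projection matrix."""
    gates = 4 * (n_cell * (n_in + n_proj) + n_cell)
    projection = n_cell * n_proj
    return gates + projection


if __name__ == "__main__":
    print(lstm_param_count(40, 512))        # plain LSTM layer
    print(lstmp_param_count(40, 512, 128))  # LSTMP layer with projection
```

With these assumed sizes the LSTMP layer needs 411,648 parameters against 1,132,544 for the plain LSTM layer, because the recurrent weight matrices shrink from 512 columns to 128.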

2019
Vol 30 (01)
pp. 1950027
Author(s):
Xiuhui Wang
Wei Qi Yan

Human gait recognition is one of the most promising biometric technologies, especially for unobtrusive video surveillance and human identification from a distance. Aiming at improving the recognition rate, in this paper we study gait recognition using deep learning and propose a novel method based on convolutional Long Short-Term Memory (Conv-LSTM). First, we present a variation of Gait Energy Images (GEI), i.e. frame-by-frame GEI (ff-GEI), to expand the volume of available GEI data and relax the constraints of gait cycle segmentation required by existing gait recognition methods. Second, we demonstrate the effectiveness of ff-GEI by analyzing the cross-covariance of one person’s gait data. Then, making use of the temporality of human gait, we design a novel gait recognition model using Conv-LSTM. Finally, the proposed method is evaluated extensively on the CASIA Dataset B for cross-view gait recognition; furthermore, the OU-ISIR Large Population Dataset is employed to verify its generalization ability. Our experimental results show that the proposed method outperforms other algorithms on these two datasets. The results indicate that the proposed Conv-LSTM model, coupled with the new ff-GEI gait representation, can effectively solve the problems related to cross-view gait recognition.
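
For readers unfamiliar with Gait Energy Images, a conventional GEI is the pixel-wise average of aligned binary silhouettes. The sketch below shows that standard average, followed by a sliding-window variant that yields one GEI per window position. The sliding-window function is only an assumed illustration of how more GEIs could be produced from a sequence without gait cycle segmentation; the abstract does not define ff-GEI, so this should not be read as the authors' method. All names are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # binary silhouette: 1 = body pixel, 0 = background


def gait_energy_image(frames: List[Frame]) -> List[List[float]]:
    """Standard GEI: pixel-wise mean of aligned silhouettes of equal size."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / n for c in range(width)]
        for r in range(height)
    ]


def sliding_window_geis(frames: List[Frame], window: int) -> List[List[List[float]]]:
    """ASSUMPTION, not the paper's definition: one GEI per window position,
    which produces len(frames) - window + 1 images from one sequence."""
    if window < 1 or window > len(frames):
        raise ValueError("window must be between 1 and len(frames)")
    return [
        gait_energy_image(frames[i:i + window])
        for i in range(len(frames) - window + 1)
    ]


if __name__ == "__main__":
    toy = [
        [[1, 0], [1, 1]],
        [[1, 0], [0, 1]],
        [[0, 0], [1, 1]],
    ]
    print(gait_energy_image(toy))
    print(len(sliding_window_geis(toy, 2)))
```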


2022
Vol 12 (2)
pp. 735
Author(s):  
Tola Pheng ◽  
Tserenpurev Chuluunsaikhan ◽  
Ga-Ae Ryu ◽  
Sung-Hoon Kim ◽  
Aziz Nasridinov ◽  
...  

In the manufacturing industry, the process capability index (Cpk) measures the level and capability required to improve the processes. However, the Cpk alone is not enough to represent the capability and performance of manufacturing processes. Because a smart manufacturing environment can accommodate the big data collected from various facilities, we need to understand the state of the process by comprehensively considering the diverse factors involved in manufacturing. In this paper, a two-stage method is proposed to analyze the process quality performance (PQP) and predict future process quality. First, we propose the PQP as a new measure for representing process capability and performance, which is defined by a composite statistical process analysis of such factors as manufacturing cycle time analysis, process trajectory of abnormal detection, statistical process control analysis, and process capability control analysis. Second, the PQP analysis results are used to predict and estimate the stability of the production process using a long short-term memory (LSTM) neural network, a deep learning method. The present work compares the LSTM prediction model with random forest, autoregressive integrated moving average, and artificial neural network models to demonstrate the effectiveness of the proposed approach. Notably, the LSTM model achieved higher accuracy than the other models.
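
As background for the Cpk mentioned above, the textbook definition is Cpk = min(USL - mean, mean - LSL) / (3 * sigma), where LSL and USL are the lower and upper specification limits. The sketch below computes it from a sample using the sample standard deviation; the function name and the measurement values are hypothetical, and the PQP measure proposed in the paper is not reproduced here.

```python
import statistics
from typing import Sequence


def cpk(samples: Sequence[float], lsl: float, usl: float) -> float:
    """Textbook process capability index from measured samples.

    Uses the sample standard deviation; lsl/usl are the lower and upper
    specification limits (hypothetical values in the demo below).
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    if usl <= lsl:
        raise ValueError("usl must be greater than lsl")
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    if sigma == 0:
        raise ValueError("samples have zero spread; Cpk is undefined")
    # The nearer specification limit determines the index.
    return min(usl - mu, mu - lsl) / (3 * sigma)


if __name__ == "__main__":
    measurements = [9.8, 10.0, 10.2, 10.0, 10.0]  # made-up data
    print(round(cpk(measurements, lsl=9.4, usl=10.6), 4))
```

A single number like this summarizes only centering and spread against fixed limits, which is consistent with the abstract's point that Cpk by itself does not capture the full state of a process.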


2021
Vol 38 (5)
pp. 1521-1530
Author(s):  
Yanming Zhao ◽  
Hong Yang ◽  
Guoan Su

In traditional slow feature analysis (SFA), the polynomial basis function expansion lacks support from theories of primate visual computation, and cannot learn uniform, continuous long- and short-term features through a selective visual mechanism. To address these defects, this paper designs and implements a slow feature algorithm coupling visual selectivity with multiple long short-term memory networks (LSTMs). Inspired by the visual invariance theory of natural images, this paper replaces the principal component analysis (PCA) of the traditional SFA algorithm with myTICA (TICA: topologically independent component analysis) to extract invariant Gabor basis functions from images and to initialize the space and series of basis functions. In view of the ability of the LSTM to learn long- and short-term features, four LSTM algorithms were constructed to separately predict the long- and short-term visual selectivity features of Gabor basis functions from the basis function series, and to combine the functions into a new basis function, thereby solving the defect of polynomial prediction algorithms. In addition, a Lipschitz consistency condition was designed and used to develop an approximate orthogonal pruning technique, which optimizes the prediction basis functions and constructs a hyper-complete space for the basis function. The performance of our algorithm was evaluated by three metrics and mySFA’s classification method. The experimental results show that our algorithm achieved a good prediction effect on the INRIA Holidays dataset and outperformed SFA, graph-based SFA (GSFA), TICA, and myTICA in accuracy and feasibility; when the threshold was 6, the recognition rate of our algorithm was 99.98%, and the false accept rate (FAR) and false reject rate (FRR) were both smaller than 0.02%, indicating the strong classification ability of our approach.
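
The FAR and FRR figures quoted above depend on a decision threshold. As a generic illustration of how the two rates are usually computed, the sketch below assumes that each comparison yields a similarity score and that a score at or above the threshold counts as an accept. That convention, the function name, and the scores are assumptions for illustration; the abstract does not state how its threshold of 6 is applied.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (FAR, FRR) for one threshold.

    ASSUMPTION: higher score means a better match, and a comparison is
    accepted when score >= threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # Impostor comparisons that were wrongly accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine comparisons that were wrongly rejected.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )


if __name__ == "__main__":
    genuine = [7, 8, 5, 9]   # made-up scores for matching pairs
    impostor = [2, 6, 3, 1]  # made-up scores for non-matching pairs
    print(far_frr(genuine, impostor, threshold=6))
```

Raising the threshold lowers FAR and raises FRR, so the two rates are normally reported together for a stated threshold, as the abstract does.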


Author(s):  
Viet Quoc Huynh ◽  
Quynh Nguyen-Thi-Nhu ◽  
Minh Duc Tran ◽  
Anh Ngoc Le ◽  
Phuoc Thanh Nguyen ◽  
...  

Human emotion plays an important role in nonverbal communication, and it also supports research on human behavior. In addition, electroencephalogram (EEG) signals have been widely confirmed by researchers to be reliable as well as easy to store and recognize. Using EEG signals to identify emotions is therefore a relatively new field. Many researchers are targeting the key ideas in this research field, such as signal preprocessing, feature extraction, and algorithm optimization. In this paper, we aim to recognize emotion signals using Long Short-Term Memory (LSTM) algorithms. The emotional signal dataset was taken from the DEAP database of Koelstra et al. The research focuses on accuracy and training time, and it tests different architectural types as well as initializations of the LSTM. The obtained results show that the 3-dimensional cube structure has better performance than the 2-dimensional cube structure. In addition, our research is compared with other authors' studies to demonstrate the effectiveness of the classification algorithm.

