Acoustic Feature Extraction and Optimized Neural Network based Classification for Speaker Recognition

Identifying a person from his or her voice characteristics is an essential trait of human interaction. Automatic speaker recognition (ASR) systems are developed to establish the identity of a speaker in forensics, business interactions, and law enforcement. This can be achieved by extracting prosodic, linguistic, and acoustic speech characteristics. Furthermore, optimized neural network based approaches for classifying the extracted features are reviewed. In this paper, we survey the literature on speaker recognition with neural networks trained using optimization algorithms, as developed for ASR systems in previous years. We discuss different aspects of ASR systems, including features, neural network based classification, performance metrics, and standard evaluation data sets. The ASR system is discussed in two parts: the first part illustrates different feature extraction techniques, and the second part covers the classification approaches that identify the speaker. We complete this evaluation with a comparative analysis of various speaker recognition approaches and their results.

Automatic speaker recognition is the process of automatically identifying a person from his or her voice. A robust feature extraction algorithm is required for effective and efficient classification. In this paper, a new method is proposed for identifying the speaker using an artificial neural network. Mel-frequency cepstral coefficients (MFCC) are used as the feature extraction technique, providing useful features for the recognition process. Input samples are created from the extracted feature values, and classification is then performed using a Multilayer Perceptron (MLP) trained by backpropagation. The proposed method gives an accuracy of 94.44%.
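The MFCC step mentioned in this abstract can be sketched as follows. This is a minimal NumPy illustration of the textbook pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, log, DCT-II), not the authors' implementation. All parameter values (16 kHz sample rate, 25 ms frames, 26 filters, 13 coefficients, 0.97 pre-emphasis) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumed values."""
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, then log compression.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II decorrelates the log energies; keep the first n_ceps.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T

# Usage on a synthetic 1-second tone (a stand-in for real speech).
tone = np.sin(2 * np.pi * 440.0 * np.arange(16000) / 16000.0)
features = mfcc(tone)
utterance_vector = features.mean(axis=0)  # one possible fixed-length input for an MLP
```

Averaging the frames into a single vector is only one way to build MLP input samples; the abstract does not say how the input samples were formed.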


Author(s):  
Satyanand Singh

The automatic speaker recognition (ASR) system has emerged as an important means of confirming identity in many business and e-commerce applications, as well as in forensics and law enforcement. Specialists trained in forensic recognition can perform this task far better by examining a set of acoustic, prosodic, and semantic attributes, a practice referred to as structured listening. Algorithm-based systems for forensic speaker recognition have been developed by physical scientists and forensic linguists to reduce the probability of contextual bias or preconceived judgment when a reference model is compared with an unknown audio sample and a suspect. Many researchers continue to develop automatic algorithms in signal processing and machine learning so that improved performance can effectively establish the speaker’s identity, with the automatic system performing on par with human listeners. In this paper, I examine the literature on speaker identification by machines and humans, emphasizing the key technical trends that have emerged in automatic technology over the last decade. I focus on many aspects of automatic speaker recognition (ASR) systems, including speaker-specific features, speaker models, standard assessment data sets, and performance metrics.


2021 ◽  
pp. 1-11
Author(s):  
Amita Nandal ◽  
Marija Blagojevic ◽  
Danijela Milosevic ◽  
Arvind Dhaka ◽  
Lakshmi Narayan Mishra

This paper proposes a deep learning framework for Covid-19 detection using chest X-ray images. The proposed method first enhances the image using fuzzy logic, which improves the pixel intensity and suppresses background noise. This step enhances the X-ray image quality and is generally not performed in conventional methods. The pre-processing enhancement is achieved by modeling the fuzzy membership function in terms of intensity and noise threshold. After this enhancement, a block based method divides the image into smooth and detailed regions, which form the feature set for feature extraction. After feature extraction, a hashing layer is inserted after the fully connected layer of the neural network. This hash layer improves the overall accuracy by computing the feature distances effectively. A regularization parameter is used that minimizes the feature distance between similar samples and maximizes the feature distance between dissimilar samples. Finally, classification is performed to detect Covid-19 infection. The simulation results compare the proposed model with existing methods in terms of several well-known performance indices. The Overall Accuracy, F-measure, specificity, sensitivity, and kappa statistic obtained with the proposed method are 93.53%, 93.23%, 92.74%, 92.02%, and 88.70%, respectively, for a 20:80 training-to-testing sample ratio; 93.84%, 93.53%, 93.04%, 92.33%, and 91.01%, respectively, for a 50:50 ratio; and 95.68%, 95.37%, 94.87%, 94.14%, and 90.74%, respectively, for an 80:20 ratio. These results are promising compared with the conventional methods.
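For readers unfamiliar with the performance indices listed in this abstract, the sketch below computes them from a confusion matrix using their standard definitions. It assumes a binary (Covid / non-Covid) setting, which the abstract does not state explicitly. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion-matrix counts (assumed binary task)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "kappa": kappa,
    }

# Hypothetical counts for illustration only.
example = binary_metrics(tp=45, fp=5, fn=5, tn=45)
```

Kappa is typically lower than accuracy on the same predictions because it discounts the agreement expected by chance, which matches the pattern in the figures quoted above.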


Author(s):  
Xi Li ◽  
Ting Wang ◽  
Shexiong Wang

How to use log data effectively without paying a high price for storing it has drawn researchers’ attention. In this paper, we propose a pattern-based deep learning method that extracts features from log datasets and facilitates their further use at a reasonable cost in storage performance. By exploiting the advantages of neural networks and combining statistical features with experts’ knowledge, the method achieves satisfactory results in experiments on selected datasets and on the routine systems that our group maintains. On the testing data sets, the model is at least 5% more likely to outperform its competitors in terms of accuracy. More importantly, its schema reveals a new way to combine experts’ experience with a statistical log parser.


Author(s):  
James Dallas ◽  
Yifan Weng ◽  
Tulga Ersal

In this work, a novel combined trajectory planner and tracking controller is developed for autonomous vehicles operating on off-road deformable terrains. Common approaches to trajectory planning and tracking often rely on model-dependent schemes, which use a simplified model to predict the impact of control inputs on future vehicle response. However, in an off-road context, and especially on deformable terrains, accurately modeling the vehicle response for predictive purposes can be challenging due to the complexity of the tire-terrain interaction and the limitations of state-of-the-art terramechanics models in terms of operating conditions, computation time, and continuous differentiability. To address this challenge and improve vehicle safety and performance through more accurate prediction of the plant response, this paper presents a nonlinear model predictive control framework that accounts for terrain deformability explicitly using a neural network terramechanics model for deformable terrains. The utility of the proposed scheme is demonstrated in high-fidelity simulations of a notional lightweight military vehicle on soft soil. It is shown that the neural network based controller can outperform a baseline Pacejka model based scheme by improving on performance metrics associated with the cost function. In more severe maneuvers, the neural network based controller achieves sufficient fidelity relative to the plant to complete maneuvers that lead to failure for the Pacejka based controller. Finally, it is demonstrated that the proposed framework is amenable to real-time implementation.


2017 ◽  
pp. 1437-1467
Author(s):  
Joydev Hazra ◽  
Aditi Roy Chowdhury ◽  
Paramartha Dutta

Registration of medical images, such as CT-MR and MR-MR, is a challenging area for researchers. This chapter introduces a new cluster based registration technique with the help of a supervised optimized neural network. Features are extracted from the different clusters of an image obtained from clustering algorithms. To overcome the slow convergence rate of the neural network, an optimized neural network is proposed in this chapter. The weights are optimized to increase the convergence rate and to avoid getting stuck in local minima. Different clustering algorithms are explored to minimize the clustering error of an image, and features are extracted from the most suitable one. A supervised learning method is applied to train the neural network. During this training process, an optimization algorithm, the Genetic Algorithm (GA), is used to update the weights of the neural network. To demonstrate the effectiveness of the proposed method, experiments are carried out on MR T1 and T2 data sets. The proposed method shows convincing results in comparison with other existing techniques.
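The idea of letting a genetic algorithm update neural network weights, as described in this abstract, can be sketched on a toy problem. This is a minimal illustration, not the chapter's method. The network size (2-4-1), the XOR data, truncation selection, uniform crossover, Gaussian mutation, and all hyperparameters are assumptions made for the example.

```python
import numpy as np

N_WEIGHTS = 17  # 2*4 weights + 4 biases + 4*1 weights + 1 bias

def forward(weights, x):
    """Tiny 2-4-1 network whose parameters are packed in one flat vector."""
    w1 = weights[:8].reshape(2, 4)
    b1 = weights[8:12]
    w2 = weights[12:16].reshape(4, 1)
    b2 = weights[16]
    hidden = np.tanh(x @ w1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

def mse(weights, x, y):
    """Fitness: mean squared error of the network on (x, y); lower is better."""
    return float(np.mean((forward(weights, x) - y) ** 2))

def evolve(x, y, pop_size=40, generations=100, mutation_std=0.3, seed=0):
    """Evolve weight vectors; returns the best vector and the best loss per generation."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(pop_size, N_WEIGHTS))
    history = []
    for _ in range(generations):
        losses = np.array([mse(ind, x, y) for ind in population])
        population = population[np.argsort(losses)]
        history.append(float(losses.min()))
        # Truncation selection: the best quarter survives unchanged (elitism).
        elite = population[: pop_size // 4]
        children = []
        while len(children) < pop_size - len(elite):
            parent_a, parent_b = elite[rng.integers(len(elite), size=2)]
            mask = rng.random(N_WEIGHTS) < 0.5                  # uniform crossover
            child = np.where(mask, parent_a, parent_b)
            child = child + rng.normal(0.0, mutation_std, size=N_WEIGHTS)  # mutation
            children.append(child)
        population = np.vstack([elite, np.array(children)])
    losses = np.array([mse(ind, x, y) for ind in population])
    history.append(float(losses.min()))
    return population[np.argmin(losses)], history

# Toy supervised task (XOR) standing in for the registration features.
x_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = np.array([[0.0], [1.0], [1.0], [0.0]])
best_weights, loss_history = evolve(x_train, y_train)
```

Because the elite individuals are copied unchanged into the next generation, the best loss never increases from one generation to the next. The search uses no gradients, which is the property that helps a population-based method escape local minima where plain backpropagation may stall.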


Author(s):  
Jerry Lin ◽  
Rajeev Kumar Pandey ◽  
Paul C.-P. Chao

This study proposes a reduced AI model for the accurate measurement of blood pressure (BP). Varied temporal periods of photoplethysmography (PPG) waveforms are used as the features from which artificial neural networks estimate blood pressure. A nonlinear principal component analysis (PCA) method is used to remove redundant features and determine a set of dominant features that is highly correlated with BP. The reduced feature set not only helps to minimize the size of the neural network but also improves the measurement accuracy of the systolic blood pressure (SBP) and diastolic blood pressure (DBP). The designed neural network has a 5-input layer, 2 hidden layers (32 nodes each), and 2 output nodes for SBP and DBP, respectively. The NN model is trained on PPG data sets acquired from 96 subjects. The testing regression value obtained for the SBP and DBP estimation is 0.81. The resultant errors for the SBP and DBP measurements are 2.00±6.08 mmHg and 1.87±4.09 mmHg, respectively. According to the Association for the Advancement of Medical Instrumentation (AAMI) and British Hypertension Society (BHS) standards, the measured error of ±6.08 mmHg is less than 8 mmHg, which shows that the device performance is in grade “A”.
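The network shape stated in this abstract (5 inputs, two hidden layers of 32 nodes, 2 outputs) can be sketched as follows, together with a check of the reported errors against the commonly cited AAMI limits (mean error within ±5 mmHg, standard deviation within 8 mmHg). The ReLU activation, the random initialization, and the function names are assumptions; the abstract does not specify them, and the weights here are untrained.

```python
import numpy as np

LAYER_SIZES = [5, 32, 32, 2]  # inputs, hidden, hidden, outputs (SBP, DBP)

def init_params(rng):
    """Random (untrained) weights and zero biases for each layer."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
    ]

def predict(params, features):
    """Forward pass: features of shape (n, 5) -> estimates of shape (n, 2)."""
    a = features
    for w, b in params[:-1]:
        a = np.maximum(0.0, a @ w + b)   # ReLU is an assumed activation
    w, b = params[-1]
    return a @ w + b                     # linear outputs for regression

def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(w.size + b.size for w, b in params)

def meets_aami(mean_error, sd_error):
    """AAMI criterion: |mean error| <= 5 mmHg and standard deviation <= 8 mmHg."""
    return abs(mean_error) <= 5.0 and sd_error <= 8.0

rng = np.random.default_rng(0)
params = init_params(rng)
estimates = predict(params, rng.normal(size=(3, 5)))  # 3 hypothetical feature rows
n_parameters = count_params(params)                   # (5*32+32) + (32*32+32) + (32*2+2)

# Errors reported in the abstract (mean ± SD, mmHg).
sbp_ok = meets_aami(2.00, 6.08)
dbp_ok = meets_aami(1.87, 4.09)
```

With these layer sizes the model has 1314 trainable parameters, which illustrates how reducing the input to 5 dominant features keeps the network small.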

