Voice Feature Extraction for Gender and Emotion Recognition

Author(s):  
Vani Nair ◽  
Pooja Pillai ◽  
Anupama Subramanian ◽  
Sarah Khalife ◽  
Dr. Madhu Nashipudimath

Voice recognition plays a key role in spoken communication, helping to identify the emotions of a person as reflected in the voice. Gender classification through speech is widely used in Human Computer Interaction (HCI), since identifying gender by computer is not easy. This led to the development of a model for “Voice Feature Extraction for Emotion and Gender Recognition”. The speech signal consists of semantic information and speaker information (gender, age, emotional state), accompanied by noise. Females and males have different voice characteristics due to their acoustical and perceptual differences, along with a variety of emotions which convey their own unique perceptions. To explore this area, feature extraction requires pre-processing of the data, which is necessary for increasing accuracy. The proposed model follows steps such as data extraction, pre-processing using a Voice Activity Detector (VAD), feature extraction using Mel-Frequency Cepstral Coefficients (MFCC), feature reduction by Principal Component Analysis (PCA), and a Support Vector Machine (SVM) classifier. The proposed combination of techniques produced better results, which can be useful in the healthcare sector, virtual assistants, security, and other fields related to the Human Machine Interaction domain.
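
A minimal sketch of the pipeline described above (energy-based VAD, MFCC extraction, PCA reduction, SVM classification), not the authors' implementation; the file list `paths`, the label array `labels`, the 16 kHz sampling rate, and the silence threshold are assumptions.

```python
# Sketch only: energy-based VAD, mean MFCCs per clip, PCA, and an RBF-kernel SVM.
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_features(path, n_mfcc=13):
    """Load a clip, keep voiced segments (crude energy-based VAD), return mean MFCCs."""
    y, sr = librosa.load(path, sr=16000)
    intervals = librosa.effects.split(y, top_db=30)   # drop near-silent spans
    voiced = np.concatenate([y[s:e] for s, e in intervals]) if len(intervals) else y
    mfcc = librosa.feature.mfcc(y=voiced, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                          # one fixed-length vector per clip

# `paths` (audio files) and `labels` (gender or emotion classes) are placeholders.
X = np.array([extract_features(p) for p in paths])
model = make_pipeline(StandardScaler(), PCA(n_components=8), SVC(kernel="rbf"))
model.fit(X, labels)
```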

2021 ◽  
Vol 40 ◽  
pp. 03008
Author(s):  
Madhu M. Nashipudimath ◽  
Pooja Pillai ◽  
Anupama Subramanian ◽  
Vani Nair ◽  
Sarah Khalife

Voice recognition plays a key function in spoken communication, facilitating identification of the emotions of a person as reflected in the voice. Gender classification through speech is a popular Human Computer Interaction (HCI) method, since determining gender by computer is hard. This led to the development of a model for "Voice Feature Extraction for Emotion and Gender Recognition". The speech signal consists of semantic information and speaker information (gender, age, emotional state), accompanied by noise. Females and males have specific vocal traits because of their acoustical and perceptual variations, along with a variety of emotions which carry their own specific perceptions. To explore this area, feature extraction requires pre-processing of the data, which is necessary for increasing accuracy. The proposed model follows steps such as data extraction, pre-processing using a Voice Activity Detector (VAD), feature extraction using Mel-Frequency Cepstral Coefficients (MFCC), feature reduction by Principal Component Analysis (PCA), and a Support Vector Machine (SVM) classifier. The proposed combination of techniques produced better results, which can be useful in the healthcare sector, virtual assistants, security, and other fields related to the Human Machine Interaction domain.


2021 ◽  
Author(s):  
Santi Behera ◽  
Prabira Sethy

The skin is the body's largest organ, weighing approximately 8 pounds in the average adult. It isolates us and shields our bodies from hazards. However, the skin is also vulnerable to damage and to changes from its original appearance; brown, black, or blue marks, or combinations of those colors, are known as pigmented skin lesions. These common pigmented skin lesions (CPSL) are a primary precursor of skin cancer. In the healthcare sector, the categorization of CPSL is a major problem because of inaccurate outputs, overfitting, and high computational costs. Hence, we propose a classification model based on multi-deep features and a support vector machine (SVM) for the classification of CPSL. The proposed system comprises two phases: first, the performance of 11 CNN models is evaluated in a deep-feature-extraction approach with SVM. Then, the deep features of the three best-performing CNN models are concatenated and categorized with the help of an SVM. In the second step, 8192 and 12288 features are obtained by combining pairs and triples of 4096-dimensional feature vectors from the best-performing CNN models. These features are also given to the SVM classifiers. The SVM results are further evaluated with the principal component analysis (PCA) algorithm applied to the combined 8192 and 12288 features. The highest results are obtained with 12288 features. In the experiments, the combination of the deep features of AlexNet, VGG16, and VGG19 achieved the highest accuracy of 91.7% using the SVM classifier. The results show that the proposed method is a useful tool for CPSL classification.
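
The two-phase feature concatenation can be sketched roughly as follows, assuming ImageNet-pretrained AlexNet, VGG16, and VGG19 from torchvision stand in for the paper's trained models; `images` and `labels` are placeholders for the CPSL dataset.

```python
# Sketch only: 4096-d deep features from AlexNet, VGG16 and VGG19, concatenated, then SVM.
import numpy as np
import torch
from torchvision import models, transforms
from sklearn.svm import SVC

def as_feature_extractor(model):
    # Drop the final classification layer so the network emits its 4096-d penultimate features.
    model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:-1])
    return model.eval()

nets = [as_feature_extractor(m) for m in (models.alexnet(weights="DEFAULT"),
                                          models.vgg16(weights="DEFAULT"),
                                          models.vgg19(weights="DEFAULT"))]
preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def deep_features(img):                     # img: a PIL image of a skin lesion
    x = preprocess(img).unsqueeze(0)
    with torch.no_grad():
        return np.concatenate([net(x).numpy().ravel() for net in nets])   # 3 x 4096 = 12288

# `images` and `labels` are placeholders for the CPSL dataset.
X = np.stack([deep_features(im) for im in images])
svm = SVC(kernel="linear").fit(X, labels)
```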


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.
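
Since the abstract does not define the Binary Spectrum itself, the sketch below only illustrates the evaluation workflow: fixing the feature set, applying a transformation, and comparing SVM accuracy before and after. The `binary_spectrum` function here is a placeholder quantization, not the paper's transformation, and `X`, `y` are synthetic stand-ins for the Pulsed Eddy Current data.

```python
# Sketch of the evaluation workflow only; `binary_spectrum` is a placeholder quantization,
# not the paper's Binary Spectrum, and X, y are synthetic stand-ins for the sensor data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def binary_spectrum(X, n_levels=8):
    """Placeholder transform: expand each continuous feature into threshold indicators."""
    thresholds = np.quantile(X, np.linspace(0, 1, n_levels + 1)[1:-1], axis=0)
    return np.concatenate([(X > t).astype(float) for t in thresholds], axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))               # stand-in for the selected continuous features
y = rng.integers(0, 2, size=120)            # stand-in for the class labels

baseline = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
transformed = cross_val_score(SVC(kernel="rbf"), binary_spectrum(X), y, cv=5).mean()
print(f"raw features: {baseline:.3f}   transformed features: {transformed:.3f}")
```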


Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the selection of the feature extraction method. Different existing OCR systems use various feature extraction methods because of the diversity of the scripts’ natures. One major contribution of the work in this paper is the design of logically rigorous coding based features. To show the effectiveness of the proposed method, this paper assumes the documents have been successfully segmented into characters, and features are extracted from these isolated Myanmar characters. These features are extracted using structural analysis of the Myanmar scripts. The experiments were carried out using a Support Vector Machine (SVM) classifier and compared with the previously proposed feature extraction method.
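
A rough sketch of the comparison protocol only, assuming characters are already segmented; `extract_coding_features` and `extract_previous_features` are hypothetical stand-ins for the proposed structural coding features and the earlier method, and `char_images`/`labels` are placeholders.

```python
# Sketch of the comparison protocol only; the feature extractors and data are placeholders.
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def evaluate(feature_fn, char_images, labels):
    """Train an SVM on one feature representation and report held-out accuracy."""
    X = [feature_fn(img) for img in char_images]
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
    clf = SVC(kernel="rbf").fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# `extract_coding_features`, `extract_previous_features`, `char_images`, and `labels`
# are hypothetical stand-ins for the proposed and earlier feature extraction methods.
print("proposed features:", evaluate(extract_coding_features, char_images, labels))
print("previous features:", evaluate(extract_previous_features, char_images, labels))
```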


2018 ◽  
Vol 10 (7) ◽  
pp. 1123 ◽  
Author(s):  
Yuhang Zhang ◽  
Hao Sun ◽  
Jiawei Zuo ◽  
Hongqi Wang ◽  
Guangluan Xu ◽  
...  

Aircraft type recognition plays an important role in remote sensing image interpretation. Traditional methods suffer from poor generalization performance, while deep learning methods require large amounts of data with type labels, which are quite expensive and time-consuming to obtain. To overcome these problems, in this paper, we propose an aircraft type recognition framework based on conditional generative adversarial networks (GANs). First, we design a new method to precisely detect aircraft keypoints, which are used to generate aircraft masks and locate the positions of the aircraft. Second, a conditional GAN with a region of interest (ROI)-weighted loss function is trained on unlabeled aircraft images and their corresponding masks. Third, an ROI feature extraction method is carefully designed to extract multi-scale features from the GAN in the aircraft regions. After that, a linear support vector machine (SVM) classifier is adopted to classify each sample using its features. Benefiting from the GAN, we can learn features strong enough to represent aircraft based on a large unlabeled dataset. Additionally, the ROI-weighted loss function and the ROI feature extraction method make the features more related to the aircraft rather than the background, which improves the quality of the features and increases the recognition accuracy significantly. Thorough experiments were conducted on a challenging dataset, and the results prove the effectiveness of the proposed aircraft type recognition framework.
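
One plausible reading of the ROI-weighted loss is an L1 reconstruction term in which pixels inside the aircraft mask are weighted more heavily than the background; the sketch below illustrates that idea only, and the `roi_weight` value and tensor shapes are assumptions rather than the paper's settings.

```python
# Sketch of one plausible ROI-weighted reconstruction term, not the paper's exact loss.
import torch

def roi_weighted_l1(generated, target, mask, roi_weight=10.0):
    """L1 loss where pixels inside the aircraft mask (mask == 1) weigh more than background."""
    weights = 1.0 + (roi_weight - 1.0) * mask
    return (weights * (generated - target).abs()).mean()

# Synthetic example: 4 generated/real image pairs with matching binary masks.
fake = torch.rand(4, 3, 256, 256)
real = torch.rand(4, 3, 256, 256)
mask = (torch.rand(4, 1, 256, 256) > 0.5).float()
loss = roi_weighted_l1(fake, real, mask)
```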


Author(s):  
Ke Li ◽  
Yalei Wu ◽  
Shimin Song ◽  
Yi Sun ◽  
Jun Wang ◽  
...  

The measurement of spacecraft electrical characteristics and the associated multi-label classification generally involve processing a large amount of unlabeled test data, high-dimensional feature redundancy, time-consuming computation, and slow identification rates. In this paper, an offline fuzzy c-means (FCM) clustering algorithm and an online approximate weighted proximal support vector machine (WPSVM) recognition approach are proposed to reduce the feature size and improve the speed of classification of electrical characteristics in the spacecraft. In addition, principal component feature extraction is applied to the complex signals for the feature selection process. Furthermore, a threshold-based data capture contribution approach is applied to resolve the component selection problem of principal component analysis (PCA), which effectively guarantees the validity and consistency of the data. Experimental results indicate that the proposed approach can obtain better fault diagnosis results for the spacecraft electrical characteristics data, improve identification accuracy, and shorten computing time with high efficiency.
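
Threshold-based selection of principal components can be sketched as below: keep the smallest number of components whose cumulative explained variance reaches a chosen threshold. The 0.95 threshold and the synthetic feature matrix are assumptions, not the paper's values.

```python
# Sketch of threshold-based principal component selection; the 0.95 threshold and the
# synthetic feature matrix stand in for the paper's data and settings.
import numpy as np
from sklearn.decomposition import PCA

def select_components(X, threshold=0.95):
    """Keep the fewest components whose cumulative explained variance reaches `threshold`."""
    cumulative = np.cumsum(PCA().fit(X).explained_variance_ratio_)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return PCA(n_components=k).fit_transform(X), k

X = np.random.default_rng(0).normal(size=(200, 50))   # stand-in for the high-dimensional features
X_reduced, k = select_components(X)
print(f"kept {k} of {X.shape[1]} components")
```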


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. Preprocessing, feature extraction, and feature selection of audio signals traditionally need expert attention and take much effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry auditory signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy, with the kernel-based infant cry classification system achieving 88.89%.
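
A minimal sketch of the described flow (STFT spectrogram, CNN features, RBF-kernel SVM), with an ImageNet-pretrained ResNet-18 standing in for the paper's DCNN; `paths`, `labels`, the sampling rate, and the STFT parameters are assumptions.

```python
# Sketch only: log-magnitude STFT spectrogram, pretrained-CNN features, RBF-kernel SVM.
import numpy as np
import librosa
import torch
from torchvision import models
from sklearn.svm import SVC

backbone = models.resnet18(weights="DEFAULT")
backbone.fc = torch.nn.Identity()                 # expose the 512-d penultimate features
backbone.eval()

def spectrogram_features(path):
    y, sr = librosa.load(path, sr=16000)
    S = librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=512, hop_length=256)), ref=np.max)
    img = torch.tensor(S, dtype=torch.float32)[None, None]            # 1 x 1 x freq x time
    img = torch.nn.functional.interpolate(img, size=(224, 224)).repeat(1, 3, 1, 1)
    with torch.no_grad():
        return backbone(img).numpy().ravel()

# `paths` and `labels` (pain / hunger / sleepiness) are placeholders for the cry dataset.
X = np.stack([spectrogram_features(p) for p in paths])
clf = SVC(kernel="rbf").fit(X, labels)
```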


2020 ◽  
Vol 17 (4) ◽  
pp. 572-578
Author(s):  
Mohammad Parseh ◽  
Mohammad Rahmanimanesh ◽  
Parviz Keshavarzi

Persian handwritten digit recognition is one of the important topics of image processing that has been significantly considered by researchers due to its many applications. The most important challenge in Persian handwritten digit recognition is the existence of various patterns in Persian digit writing, which makes the feature extraction step more complicated. Since handcrafted feature extraction methods are complicated processes and their performance is not stable, most recent studies have concentrated on proposing a suitable method for automatic feature extraction. In this paper, an automatic method based on machine learning is proposed for high-level feature extraction from Persian digit images using a Convolutional Neural Network (CNN). After that, a non-linear multi-class Support Vector Machine (SVM) classifier is used for data classification instead of the fully connected final layer of the CNN. The proposed method has been applied to the HODA dataset and obtained a recognition rate of 99.56%. The experimental results are comparable with previous state-of-the-art methods.
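
The classifier swap can be sketched as follows, assuming a CNN `trained_cnn` has already been trained on the digit images; the architecture, `train_images`, `train_labels`, and the SVM hyperparameters are placeholders, not the paper's configuration.

```python
# Sketch of the classifier swap only; `trained_cnn`, `train_images`, and `train_labels`
# are placeholders, and the surgery below assumes the last child module is the final FC layer.
import torch
from sklearn.svm import SVC

feature_net = torch.nn.Sequential(*list(trained_cnn.children())[:-1])   # drop the final FC layer
feature_net.eval()

with torch.no_grad():
    feats = feature_net(train_images).flatten(1).numpy()   # one feature vector per digit image

svm = SVC(kernel="rbf", C=10).fit(feats, train_labels)     # non-linear multi-class SVM
```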


2021 ◽  
pp. 6787-6794
Author(s):  
Anisha Rebinth ◽  
Dr. S. Mohan Kumar

An automated Computer Aided Diagnosis (CAD) system for glaucoma diagnosis using fundus images is developed. Various glaucoma image classification schemes using supervised and unsupervised learning approaches are reviewed. The research involves three stages of glaucoma disease diagnosis. First, in the pre-processing stage, the texture features of the fundus image are recorded with a two-dimensional Gabor filter at various sizes and orientations. The image features are generated using higher-order statistical characteristics, and then Principal Component Analysis (PCA) is used to select and reduce the dimension of the image features. For the performance study, the Gabor filter based features are extracted from the RIM-ONE and HRF database images, and a Support Vector Machine (SVM) classifier is used for classification. The final stage utilizes the SVM classifier with the Radial Basis Function (RBF) kernel for efficient classification of glaucoma disease with an accuracy of 90%.
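
A minimal sketch of the three stages, assuming a small Gabor filter bank, simple higher-order statistics, PCA, and an RBF-SVM; the filter frequencies, orientations, number of retained components, and the synthetic `fundus_images`/`labels` arrays are assumptions.

```python
# Sketch only: Gabor filter bank, higher-order statistics, PCA, RBF-kernel SVM.
import numpy as np
from scipy.stats import kurtosis, skew
from skimage.filters import gabor
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def gabor_stats(image, frequencies=(0.1, 0.2, 0.4), thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
    feats = []
    for f in frequencies:
        for t in thetas:
            real, _ = gabor(image, frequency=f, theta=t)
            r = real.ravel()
            feats += [r.mean(), r.var(), skew(r), kurtosis(r)]   # higher-order statistics
    return np.array(feats)

# Synthetic stand-ins for the RIM-ONE / HRF fundus images and their labels.
rng = np.random.default_rng(0)
fundus_images = [rng.random((64, 64)) for _ in range(20)]
labels = np.arange(20) % 2

X = np.stack([gabor_stats(img) for img in fundus_images])
clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf")).fit(X, labels)
```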


2019 ◽  
Vol 9 (15) ◽  
pp. 3130 ◽  
Author(s):  
Navarro ◽  
Perez

Many applications in image analysis require the accurate classification of complex patterns including both color and texture, e.g., in content-based image retrieval, biometrics, and the inspection of fabrics, wood, steel, ceramics, and fruits, among others. A new method for pattern classification using both color and texture information is proposed in this paper. The proposed method includes the following steps: division of each image into global and local samples, texture and color feature extraction from samples using Haralick statistics and the binary quaternion-moment-preserving method, a classification stage using a support vector machine, and a final post-processing stage employing a bagging ensemble. One of the main contributions of this method is the image partition, allowing image representation by global and local features. This partition captures most of the information present in the image for colored texture classification, allowing improved results. The proposed method was tested on four databases extensively used in color–texture classification: the Brodatz, VisTex, Outex, and KTH-TIPS2b databases, yielding correct classification rates of 97.63%, 97.13%, 90.78%, and 92.90%, respectively. The use of the post-processing stage improved those results to 99.88%, 100%, 98.97%, and 95.75%, respectively. We compared our results to the best previously published results on the same databases, finding significant improvements in all cases.
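
A partial sketch covering the texture branch only: Haralick-style statistics from a gray-level co-occurrence matrix, an SVM, and a bagging ensemble as post-processing. The color (binary quaternion-moment-preserving) features and the global/local partition are omitted, and the GLCM settings, ensemble size, and synthetic `patches`/`labels` are assumptions.

```python
# Partial sketch: GLCM (Haralick-style) texture statistics, an SVM, and a bagging ensemble.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

def haralick_features(gray_image):
    """gray_image: 2-D uint8 array; returns GLCM statistics at two offsets and two angles."""
    glcm = graycomatrix(gray_image, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

# Synthetic stand-ins for the texture samples and their class labels.
rng = np.random.default_rng(0)
patches = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(20)]
labels = np.arange(20) % 2

X = np.stack([haralick_features(p) for p in patches])
clf = BaggingClassifier(SVC(kernel="rbf"), n_estimators=10).fit(X, labels)
```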

