voice pathology detection
Recently Published Documents


Total documents: 72 (five years: 22)
H-index: 15 (five years: 1)

Author(s):  
Fadwa Abakarim ◽  
Abdenbi Abenaou

In this paper, an automatic voice pathology recognition system is presented. Features are extracted with the Adaptive Orthogonal Transform method, and their statistical properties are summarized by the mean, variance, skewness and kurtosis. Classification uses two models that are widely used in signal processing: Support Vector Machine (SVM) and Multilayer Perceptron (MLP). The proposed system is tested on a German voice database, the Saarbruecken Voice Database (SVD). The experimental results show that the Adaptive Orthogonal Transform method performs best with the Multilayer Perceptron neural network, which achieved 98.87% accuracy, whereas its combination with the Support Vector Machine reached 85.79% accuracy.
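
The four statistical descriptors named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the coefficient list stands in for the output of the Adaptive Orthogonal Transform, which is not reproduced here, and the function name `summary_statistics` and the population (divide by n) moments are assumptions.

```python
def summary_statistics(coeffs):
    """Return (mean, variance, skewness, kurtosis) of a coefficient sequence.

    Uses population central moments (dividing by n); this convention is an
    assumption, the abstract does not say which estimator was used.
    """
    n = len(coeffs)
    mean = sum(coeffs) / n
    # Second, third and fourth central moments.
    m2 = sum((c - mean) ** 2 for c in coeffs) / n
    m3 = sum((c - mean) ** 3 for c in coeffs) / n
    m4 = sum((c - mean) ** 4 for c in coeffs) / n
    if m2 == 0:
        # A constant sequence has no spread, so the shape terms are set to 0.
        return mean, 0.0, 0.0, 0.0
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mean, m2, skewness, kurtosis


# Hypothetical transform coefficients for one voice frame.
features = summary_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```

The resulting four-value tuple would form one feature vector per frame or per recording, which could then be passed to a classifier such as an SVM or MLP.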


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Lei Geng ◽  
Hongfeng Shan ◽  
Zhitao Xiao ◽  
Wei Wang ◽  
Mei Wei

Automatic voice pathology detection and classification play an important role in the diagnosis and prevention of voice disorders. To accurately describe the pronunciation characteristics of patients with dysarthria and improve pathological voice detection, this study proposes a detection method based on a multi-modal network structure. First, speech signals and electroglottography (EGG) signals are mapped from the time domain to frequency-domain spectrograms via a short-time Fourier transform (STFT). A Mel filter bank is then applied to the spectrogram to enhance the signal's harmonics and reduce noise. Second, a pre-trained convolutional neural network (CNN) is used as the backbone network to extract sound state features and vocal cord vibration features from the two signals. To obtain better classification, the fused features are input into a long short-term memory (LSTM) network for voice feature selection and enhancement. The proposed system achieves 95.73% accuracy, a 96.10% F1-score and 96.73% recall on the Saarbruecken Voice Database (SVD), offering a new method for pathological speech detection.
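
The Mel filtering step described above can be illustrated with a small triangular filter bank applied to one STFT power frame. This is a sketch under stated assumptions: the filter count, FFT size and sample rate are hypothetical, the Hz-to-mel conversion uses the common 2595 * log10(1 + f / 700) formula, and the abstract does not say which variant or parameters the authors used.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common log10 formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Build triangular filters over the n_fft // 2 + 1 non-negative bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, evenly spaced on the mel scale.
    mel_points = [mel_max * i / (n_filters + 1) for i in range(n_filters + 2)]
    # Fractional bin positions keep neighbouring edges strictly distinct.
    bin_points = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bin_points[j - 1], bin_points[j], bin_points[j + 1]
        row = []
        for k in range(n_bins):
            if left < k <= centre:
                weight = (k - left) / (centre - left)    # rising slope
            elif centre < k < right:
                weight = (right - k) / (right - centre)  # falling slope
            else:
                weight = 0.0
            row.append(weight)
        bank.append(row)
    return bank


def apply_filter_bank(bank, power_frame):
    """Weight one STFT power frame by each filter and sum per filter."""
    return [sum(w * p for w, p in zip(row, power_frame)) for row in bank]


# Hypothetical settings: 8 filters, 256-point FFT, 16 kHz audio.
bank = mel_filter_bank(8, 256, 16000)
mel_energies = apply_filter_bank(bank, [1.0] * 129)
```

Applying the bank to every frame of the speech and EGG spectrograms would give the Mel spectrograms that a CNN backbone could take as input.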


2021 ◽  
Author(s):  
Fahad Taha AL-Dhief ◽  
Nurul Mu'azzah Abdul Latiff ◽  
Marina Mat Baki ◽  
Nik Noordini Nik Abd. Malik ◽  
Naseer Sabri ◽  
...  

Author(s):  
Vikas Mittal ◽  
R. K. Sharma

A non-invasive and robust voice pathology detection and classification architecture is proposed in this manuscript. In place of conventional feature-based machine learning techniques, the new architecture first performs deep learning-based filtering of the input voice signal, followed by a decision-level fusion of deep learning and a non-parametric learner. The efficacy of the proposed technique is verified through a comparative study with very recent work on the same dataset but based on different training algorithms. The proposed architecture has five stages. The results are recorded in terms of nine classification score indices: mean average precision, sensitivity, specificity, F1 score, accuracy, error, false-positive rate, Matthews correlation coefficient, and Cohen's kappa index. The experimental results show that a machine learning classifier reaches at most 96.12% accuracy, while the proposed technique achieved the highest accuracy, 99.14%, among the compared techniques.
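
The score indices listed above can all be derived from a binary confusion matrix, as in the sketch below. The counts are made up for illustration, the function name `classification_scores` is hypothetical, and plain precision is computed here rather than the mean average precision the abstract mentions. The sketch assumes every denominator is nonzero.

```python
import math


def classification_scores(tp, fp, tn, fn):
    """Compute common binary classification scores from confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / total
    error = 1 - accuracy
    false_positive_rate = fp / (fp + tn)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - chance) / (1 - chance)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
        "error": error,
        "false_positive_rate": false_positive_rate,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts: pathological is the positive class.
scores = classification_scores(tp=45, fp=5, tn=40, fn=10)
```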


2021 ◽  
Vol 70 ◽  
pp. 102973
Author(s):  
Huijun Ding ◽  
Zixiong Gu ◽  
Peng Dai ◽  
Zhou Zhou ◽  
Lu Wang ◽  
...  

2021 ◽  
Vol 106 ◽  
pp. 107310
Author(s):  
Haydar Ankışhan ◽  
Sıtkı Çağdaş İnam

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Sidra Abid Syed ◽  
Munaf Rashid ◽  
Samreen Hussain ◽  
Hira Zahid

Diagnosis based on a computerized acoustic examination may play an important role in early diagnosis, in monitoring, and in improving pathological speech diagnostics. Various acoustic metrics test the health of the voice. The precision of these parameters also depends on the algorithms used to detect speech noise. The idea is to detect the pathology from the voice. First, we apply feature extraction to the SVD dataset. After feature extraction, the input is fed into 27-layer neural networks: a convolutional neural network (CNN) and a recurrent neural network (RNN). We divided the dataset into training and testing sets and evaluated the classifiers with 10-fold cross-validation; the reported accuracies of the CNN and RNN are 87.11% and 86.52%, respectively. The program code was written in Python using the TensorFlow package and run on a Linux workstation with one NVidia Titan X GPU.
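
The 10-fold cross-validation protocol mentioned above can be sketched with index splitting alone. This is a generic illustration, not the authors' code: the function names, the fixed shuffle seed and the `evaluate` callback (which would train a model on the training indices and return its accuracy on the test indices) are all assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    fold_scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(fold_scores) / len(fold_scores)


# Each sample lands in exactly one test fold.
splits = list(k_fold_indices(95, k=10))
```

Every sample is used for testing exactly once and for training in the other nine folds, so the averaged score uses the whole dataset without testing on training data.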


2021 ◽  
Vol 11 (8) ◽  
pp. 3450
Author(s):  
Ziqi Fan ◽  
Yuanbo Wu ◽  
Changwei Zhou ◽  
Xiaojun Zhang ◽  
Zhi Tao

The Massachusetts Eye and Ear Infirmary (MEEI) database is an international-standard training database for voice pathology detection (VPD) systems. However, there is a class-imbalanced distribution in normal and pathological voice samples and different types of pathological voice samples in the MEEI database. This study aimed to develop a VPD system that uses the fuzzy clustering synthetic minority oversampling technique algorithm (FC-SMOTE) to automatically detect and classify four types of pathological voices in a multi-class imbalanced database. The proposed FC-SMOTE algorithm processes the initial class-imbalanced dataset. A set of machine learning models was evaluated and validated using the resulting class-balanced dataset as an input. The effectiveness of the VPD system with FC-SMOTE was further verified by an external validation set and another pathological voice database (Saarbruecken Voice Database (SVD)). The experimental results show that, in the multi-classification of pathological voice for the class-imbalanced dataset, the method we propose can significantly improve the diagnostic accuracy. Meanwhile, FC-SMOTE outperforms the traditional imbalanced data oversampling algorithms, and it is preferred for imbalanced voice diagnosis in practical applications.
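
For orientation, the basic SMOTE interpolation step that FC-SMOTE builds on can be sketched as below. This is plain SMOTE only: the fuzzy clustering stage of the proposed algorithm is not described in the abstract and is not reproduced. The function name, the neighbour count `k` and the toy feature vectors are assumptions, and at least two minority samples are required.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for an under-represented pathology class.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_samples, n_new=6)
```

Adding the synthetic samples to the minority class narrows the gap to the majority class before a classifier is trained.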


Author(s):  
Vengateshwaran M ◽  
Gowsalya N ◽  
Atchaya K ◽  
Nivetha R

The use of mobile applications in the healthcare sector is increasing rapidly. Mobile technologies are not only a means of communicating multimedia content (e.g. clinical audio-visual notes and medical records) but also promising solutions for people who want to identify, monitor, and treat their health conditions anywhere and at any time. Mobile e-healthcare systems can help make patient care faster, better, and cheaper. Several pathological conditions can benefit from the use of mobile technologies. In this paper we focus on dysphonia, an alteration of voice quality that affects about one person in three at least once in his/her lifetime. Voice disorders are spreading rapidly, although they are often underestimated. Mobile health systems can offer easy and fast support for voice pathology detection. An algorithm that discriminates between pathological and healthy voices with greater accuracy is necessary to realize a valid and precise mobile health system. Based on experimental results, the proposed technique, which uses deep neural networks with a machine learning approach, provides an accuracy of 99.89% in detecting voice pathology. The goal in this field is to detect and analyze abnormal structures without human intervention, enhancing the utility of such systems in the healthcare sector.


Author(s):  
Jeberson Retna Raj ◽  
J Jabez ◽  
S Senduru Srinivasulu ◽  
S Gowri ◽  
J S Vimali
