cepstral coefficients
Recently Published Documents

TOTAL DOCUMENTS: 397 (five years: 114)
H-INDEX: 26 (five years: 4)

Author(s):
Murugaiya Ramashini, P. Emeroylariffion Abas, Kusuma Mohanchandra, Liyanage C. De Silva

Birds are excellent environmental indicators and may signal the sustainability of an ecosystem; they provide provisioning, regulating, and supporting services. Research on birdlife conservation therefore consistently takes centre stage. Because birds are airborne and tropical forests are dense, identifying birds by their sound may be a better solution than visual identification. The goal of this study is to find the cepstral features that classify bird sounds most accurately. Fifteen (15) endemic Bornean bird sounds were selected and segmented using an automated energy-based algorithm. Three (3) types of cepstral features were extracted: linear prediction cepstral coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and gammatone frequency cepstral coefficients (GTCC), each used separately for classification with a support vector machine (SVM). Comparison of the prediction results shows that the model using GTCC features, at 93.3% accuracy, outperforms the models using MFCC and LPCC features, demonstrating the robustness of GTCC for bird sound classification. The result is significant for the advancement of bird sound classification research, which has applications such as eco-tourism and wildlife management.
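The cepstral pipeline the study compares (filterbank energies followed by a cepstral transform) can be sketched in plain numpy. This is a minimal MFCC sketch, not the paper's implementation; the sampling rate, frame size, filter count, and coefficient count below are assumed, since the abstract does not give them:

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    # (assumption: standard HTK-style mel formula).
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_filters=26, n_ceps=13):
    # Frame the signal, apply a Hamming window, take the power spectrum.
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Log mel filterbank energies, then DCT-II to decorrelate -> cepstra.
    energies = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2.0 * n_filters)))
    return energies @ dct.T  # shape: (n_frames, n_ceps)

# Example: one second of noise at 16 kHz -> a (frames x coefficients) matrix.
feats = mfcc(np.random.default_rng(0).standard_normal(16000))
print(feats.shape)  # (61, 13)
```

GTCC follows the same structure with a gammatone filterbank in place of the mel filterbank; the resulting per-frame vectors are what the SVM classifies.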


2022
Author(s):
Mahbubeh Bahreini, Ramin Barati, Abbas Kamaly

Abstract: Early diagnosis is crucial in the treatment of heart disease. Researchers have applied a variety of techniques to cardiovascular diagnosis, including the detection of heart sounds, which is an efficient and affordable technique. Body organs, including the heart, generate several sounds, and these sounds differ between individuals. A number of methodologies have recently been proposed to detect and classify normal and abnormal heart sounds. The present study proposes a technique based on Mel-frequency cepstral coefficients, fractal dimension, and a hidden Markov model. Fractal dimension is used to identify the S1 and S2 sounds; the Mel-frequency cepstral coefficients and their first- and second-order differences are then employed to extract the signal features. An adaptive Hamming window length, determined by the S1-S2 interval, is a major advantage of the methodology. Heart sounds are classified as normal or abnormal using the improved hidden Markov model with the Baum-Welch and Viterbi algorithms. The proposed framework is evaluated on a number of datasets under various scenarios.
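The first- and second-order difference features mentioned above (often called delta and delta-delta coefficients) can be sketched as follows. This is a generic regression-based delta, assumed for illustration; the paper does not state its exact difference scheme or window width:

```python
import numpy as np

def delta(feats, width=2):
    # Regression-based delta over a +/- `width` frame context
    # (assumption: the standard delta formula, not the paper's exact one).
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    num = sum(k * (padded[width + k:len(feats) + width + k]
                   - padded[width - k:len(feats) + width - k])
              for k in range(1, width + 1))
    return num / (2.0 * sum(k * k for k in range(1, width + 1)))

# Toy cepstral matrix: 10 frames x 5 coefficients, rising by 5 per frame.
ceps = np.arange(50.0).reshape(10, 5)
d1 = delta(ceps)                       # first-order differences
d2 = delta(d1)                         # second-order (delta-delta)
features = np.hstack([ceps, d1, d2])   # combined per-frame feature vector
print(features.shape)  # (10, 15)
```

For the interior frames of this toy input the slope is constant, so `d1` is 5.0 everywhere away from the edges; stacking the statics with both delta orders triples the feature dimension fed to the HMM.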


2021, Vol 1 (1), pp. 335-354
Author(s):
Heriyanto Heriyanto, Dyah Ayu Irawati

This research examines voice feature extraction using MFCC. Feature extraction is the first step to obtaining features, which must then be refined through feature selection. Feature selection in this research used the dominant weight feature of the shahada voice, with frames and cepstral coefficients produced by the feature extraction: twenty-four cepstral coefficients (0 to 23) were used, while eleven frames (0 to 10) were taken. Three hundred (300) recorded voice samples were tested against 200 voices of both male and female recordings, sampled at 44.1 kHz, 16-bit stereo. This research aimed to gain accuracy by selecting the right features on the frame using MFCC feature extraction, and by matching with frame feature selection using Dominant Weight Normalization (NBD). The results show that the MFCC method with selection of the 9th frame had a higher accuracy rate, 86%, than other frames, while MFCC without feature selection averaged 60%. The conclusion is that selecting the right features in the 9th frame improves the accuracy of recognising the voice of shahada recitation.
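Frame-level feature selection of this kind can be sketched as keeping only one frame's cepstral vector as the template and matching recordings against it. Everything below is hypothetical illustration: the normalisation and cosine matching are assumptions, not the paper's NBD procedure, and the matrix sizes simply mirror the 11 frames and 24 coefficients mentioned above:

```python
import numpy as np

def select_frame(ceps, frame_idx=9):
    # Keep only the cepstral vector of one frame (the paper reports the
    # 9th frame as best) and normalise it so templates are comparable.
    v = ceps[frame_idx]
    return v / (np.linalg.norm(v) + 1e-10)

def match(template, query):
    # Cosine similarity between two normalised frame features.
    return float(np.dot(template, query))

rng = np.random.default_rng(1)
ref = rng.standard_normal((11, 24))               # 11 frames x 24 coefficients
same = ref + 0.01 * rng.standard_normal(ref.shape)  # near-duplicate recording
other = rng.standard_normal((11, 24))             # unrelated recording

t = select_frame(ref)
print(match(t, select_frame(same)) > match(t, select_frame(other)))  # True
```

The near-duplicate scores close to 1.0 while the unrelated recording scores near 0, which is the behaviour a single-frame template needs in order to discriminate speakers.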


Author(s):  
Adwait Patil

Abstract: The coronavirus outbreak has affected the entire world adversely. This project was developed to help the public assess their chances of being COVID-positive using only a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored as bits; to analyse it, we use Mel frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to a neural network. The project uses Coughvid, a crowdsourced dataset of 27,000 audio files and metadata for the same number of patients, and a 1D Convolutional Neural Network (CNN) to process the audio and metadata. Future work will replace the binary classification with a model that rates how likely it is that a person is infected. Keywords: audio classification, Mel frequency cepstral coefficients, convolutional neural network, deep learning, Coughvid
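How MFCC frames feed a 1D CNN can be sketched with a single convolutional layer in plain numpy: the coefficients act as input channels and the convolution slides along the time axis. The layer sizes and stride below are illustrative assumptions; the paper's actual architecture is not described in the abstract:

```python
import numpy as np

def conv1d(x, kernels, stride=1):
    # Minimal valid-mode 1D convolution over a (channels, time) input,
    # followed by ReLU, as the first layer of a 1D CNN might apply it.
    out_ch, in_ch, k = kernels.shape
    steps = (x.shape[1] - k) // stride + 1
    out = np.zeros((out_ch, steps))
    for o in range(out_ch):
        for t in range(steps):
            out[o, t] = np.sum(kernels[o] * x[:, t * stride:t * stride + k])
    return np.maximum(out, 0.0)  # ReLU activation

# 13 MFCC channels over 100 frames; 8 filters of width 5, stride 2.
rng = np.random.default_rng(0)
mfccs = rng.standard_normal((13, 100))
w = rng.standard_normal((8, 13, 5)) * 0.1
feature_map = conv1d(mfccs, w, stride=2)
print(feature_map.shape)  # (8, 48)
```

Stacking such layers, pooling, and ending in a dense sigmoid unit yields the binary cough classifier; a probability-valued output, as the future-work note suggests, only changes how the final score is interpreted.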

