Time and spectral analysis methods with machine learning for the authentication of digital audio recordings

2013 ◽  
Vol 230 (1-3) ◽  
pp. 117-126 ◽  
Author(s):  
Rafal Korycki
2014 ◽  
Vol 283 ◽  
pp. 54-67
Author(s):  
Rafał Korycki ◽  

This work outlines the problem of detecting discontinuities in lossily compressed audio recordings and presents new methods for examining the authenticity of digital audio recordings. The described solutions are based on statistical analysis of data calculated from the values of MDCT coefficients. The resulting vectors, each consisting of 228 features, were used as training sequences for two supervised machine learning algorithms: linear discriminant analysis (LDA) and the support vector machine (SVM). Detection of multiple compression was used both to detect modification of a recording and to reveal traces of montage in digital audio recordings. The effectiveness of the discontinuity detection algorithms was tested on a database of recorded music consisting of nearly one million MP3 files, specially prepared for this purpose. The results are discussed in the context of the influence of compression parameters on the ability to detect interference in digital audio recordings.
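As a rough illustration of the kind of input such a method works with, the sketch below computes the MDCT of one sine-windowed frame and a few summary statistics of the coefficients. The abstract does not list the 228 features, so `coefficient_stats` and its three statistics are hypothetical stand-ins, and the frame size, tone frequency, and sampling rate are arbitrary choices for the example.

```python
import math
from statistics import mean, pvariance


def mdct(frame):
    """Direct O(N^2) MDCT: a frame of 2N samples yields N coefficients."""
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n = two_n // 2
    return [
        sum(
            frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(two_n)
        )
        for k in range(n)
    ]


def coefficient_stats(coeffs):
    """Hypothetical summary features; NOT the feature set used in the paper."""
    mags = [abs(c) for c in coeffs]
    mu = mean(mags)
    return {
        "mean_abs": mu,
        "var_abs": pvariance(mags, mu),
        # share of coefficients that are numerically zero
        "near_zero_ratio": sum(1 for m in mags if m < 1e-6) / len(mags),
    }


# Illustrative input: a sine-windowed 440 Hz tone sampled at 8 kHz.
N = 16
frame = [
    math.sin(math.pi * (t + 0.5) / (2 * N)) * math.sin(2 * math.pi * 440 * t / 8000)
    for t in range(2 * N)
]
coeffs = mdct(frame)
features = coefficient_stats(coeffs)
```

In a full pipeline, vectors like `features` would be computed per frame and passed to a classifier such as LDA or an SVM; that step is omitted here because it depends on details the abstract does not give.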


Author(s):  
Rashmika Kiran Patole ◽  
Priti Paresh Rege

The field of audio forensics has advanced considerably in recent years, with an increasing number of techniques used to analyze audio recordings submitted as evidence in legal investigations. Audio forensics involves authentication of evidentiary audio recordings, an important procedure for verifying their integrity. This chapter focuses on two audio authentication procedures, namely acoustic environment identification and tampering detection. The authors provide a framework for these procedures, discussing in detail the methodology and feature sets used in the two tasks. The main objective of this chapter is to introduce readers to different machine learning algorithms that can be used for environment identification and forgery detection. The authors also provide some promising results that demonstrate the utility of machine learning algorithms in this field.


Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and the recognition of emotions in speech (RER) is the most in-demand part of it. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computational experiments are carried out on the RAVDESS database of emotional human speech, a data set containing 7356 files. Entries contain the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed to extract the main characteristic features of each emotion. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the audio recordings. Various neural network models for emotion recognition are evaluated on the data described above. In addition, classical machine learning algorithms were used for comparative analysis. The following models were trained during the experiments: logistic regression (LR), a classifier based on the support vector machine (SVM), decision tree (DT), random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN, ResNet18), a recurrent neural network (RNN), and an ensemble of convolutional and recurrent networks (Stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the classical machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.
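The 16-class labelling (8 emotions split by speaker gender) can be sketched from RAVDESS file names, which encode seven hyphen-separated fields; the third is the emotion (01 to 08) and the seventh is the actor, with odd actor numbers male and even ones female. The function name `ravdess_class` and the ordering of the 16 class ids are assumptions made for this example, since the abstract does not say how the classes were numbered.

```python
EMOTIONS = [
    "neutral", "calm", "happiness", "sadness",
    "anger", "fear", "disgust", "surprise",
]


def ravdess_class(filename):
    """Map a RAVDESS file name to (emotion index 0-7, gender, class id 0-15)."""
    stem = filename.rsplit("/", 1)[-1].split(".")[0]
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {filename!r}")
    emotion = int(fields[2]) - 1  # file names use 01..08; the abstract uses 0..7
    actor = int(fields[6])
    gender = "male" if actor % 2 == 1 else "female"
    # Assumed ordering: classes alternate male/female within each emotion.
    class_id = emotion * 2 + (0 if gender == "male" else 1)
    return emotion, gender, class_id


# Audio-only speech file: emotion field 06, actor 12.
emotion, gender, class_id = ravdess_class("03-01-06-01-02-01-12.wav")
print(EMOTIONS[emotion], gender, class_id)
```

Feature extraction (MFCC, chroma, spectral characteristics) would then be run on each file and paired with the resulting class id.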


2022 ◽  
Author(s):  
Maria Semeli Frangopoulou ◽  
Maryam Alimardani

Alzheimer's disease (AD) is a brain disorder that is mainly characterized by a progressive degeneration of neurons in the brain, causing a decline in cognitive abilities and difficulties in engaging in day-to-day activities. This study compares an FFT-based spectral analysis against a functional connectivity analysis based on phase synchronization for finding known differences between AD patients and Healthy Control (HC) subjects. Both of these quantitative analysis methods were applied to a dataset comprising bipolar EEG montage values from 20 diagnosed AD patients and 20 age-matched HC subjects. Additionally, an attempt was made to localize the identified AD-induced brain activity effects in AD patients. The obtained results showed the advantage of the functional connectivity analysis method over a simple spectral analysis. Specifically, while spectral analysis could not find any significant differences between the AD and HC groups, the functional connectivity analysis showed statistically higher synchronization levels in the AD group in the lower frequency bands (delta and theta), suggesting that the AD patients' brains are in a phase-locked state. Further comparison of functional connectivity between the homotopic regions confirmed that the traits of AD were localized in the centro-parietal and centro-temporal areas in the theta frequency band (4-8 Hz). The contribution of this study is that it applies a neural metric for Alzheimer's detection from a data science perspective rather than from a neuroscience one. The study shows that the combination of bipolar derivations with phase synchronization yields similar results to comparable studies employing alternative analysis methods.
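To make the idea of phase synchronization concrete, the sketch below computes the phase locking value (PLV) between two signals, using a direct DFT to obtain the analytic signal. PLV is one common phase-synchronization measure and is used here as an assumption; the abstract does not name the exact metric. The 128 Hz sampling rate and the 6 Hz and 7 Hz test tones (inside the 4-8 Hz theta band) are illustrative, and real EEG would first be band-pass filtered.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a direct O(N^2) DFT (even-length input only)."""
    n = len(x)
    if n == 0 or n % 2:
        raise ValueError("signal length must be a positive even number")
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC and Nyquist, double positive frequencies, zero negative ones.
    for k in range(1, n // 2):
        spectrum[k] *= 2
    for k in range(n // 2 + 1, n):
        spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phase_locking_value(x, y):
    """PLV = |mean(exp(i * (phase_x - phase_y)))|, between 0 and 1."""
    phases_x = [cmath.phase(z) for z in analytic_signal(x)]
    phases_y = [cmath.phase(z) for z in analytic_signal(y)]
    total = sum(cmath.exp(1j * (px - py)) for px, py in zip(phases_x, phases_y))
    return abs(total) / len(x)


FS = 128  # assumed sampling rate in Hz; one second of data
a = [math.sin(2 * math.pi * 6 * t / FS) for t in range(FS)]
b = [math.sin(2 * math.pi * 6 * t / FS + math.pi / 4) for t in range(FS)]
c = [math.sin(2 * math.pi * 7 * t / FS) for t in range(FS)]

plv_locked = phase_locking_value(a, b)  # same frequency, constant phase lag
plv_drift = phase_locking_value(a, c)   # different frequencies, drifting phase
```

A constant phase lag gives a PLV close to 1, while a phase difference that sweeps a full cycle gives a PLV close to 0, which is the contrast a phase-locked state would show against unsynchronized activity.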

