Fast and Effective Copy-Move Detection of Digital Audio Based on Auto Segment

2019 ◽  
Vol 11 (2) ◽  
pp. 47-62 ◽  
Author(s):  
Xinchao Huang ◽  
Zihan Liu ◽  
Wei Lu ◽  
Hongmei Liu ◽  
Shijun Xiang

Detecting digital audio forgeries is a significant research focus in audio forensics. In this article, the authors focus on one particular form of digital audio forgery, copy-move, and propose a fast and effective method to detect doctored audio. First, the input audio is segmented into syllables by voice activity detection and syllable detection. Second, the discrete Fourier transform (DFT) is applied to each audio segment, and selected points in the frequency domain are used as its features. The segments are then sorted by their features, giving a sorted list of audio segments. Finally, each segment is compared only with a few adjacent segments in the sorted list, which reduces the time complexity. Comparisons with other state-of-the-art methods show that the proposed method can verify the authenticity of the input audio and locate the forged position quickly and effectively.
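
The sort-then-compare-neighbours idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the syllable segmentation has already been done, uses the magnitudes of the first few DFT bins as the feature (the abstract does not say which frequency points are selected), and the names `dft_magnitudes`, `find_copy_move`, `n_points`, `window`, and `tol` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(segment, n_points=8):
    """Magnitudes of the first n_points DFT bins of one segment (naive DFT)."""
    n = len(segment)
    feats = []
    for k in range(n_points):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(segment))
        feats.append(abs(acc))
    return feats


def find_copy_move(segments, n_points=8, window=2, tol=1e-6):
    """Return pairs of start positions whose segments look duplicated.

    segments: list of (start_position, samples) tuples, assumed to come
    from an earlier VAD / syllable-detection step that is not shown here.
    window:   how many following neighbours in the sorted list to compare
              against (an assumed parameter, not taken from the article).
    """
    feats = [(dft_magnitudes(samples, n_points), start)
             for start, samples in segments]
    # Lexicographic sort on the feature vectors, so that similar segments
    # tend to end up next to each other in the list.
    feats.sort()
    pairs = []
    for i, (fi, si) in enumerate(feats):
        # Compare only with a few neighbours instead of with every segment.
        for fj, sj in feats[i + 1:i + 1 + window]:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(fi, fj)))
            if dist <= tol:
                pairs.append((min(si, sj), max(si, sj)))
    return pairs


# Toy input: the segment starting at 200 is a copy of the one starting at 0.
a = [0.0, 1.0, 0.5, -0.5, -1.0, 0.2, 0.3, 0.1]
b = [0.3, -0.2, 0.9, 0.4, -0.7, 0.0, 0.5, -0.1]
c = [1.0, 0.9, 0.8, 0.1, -0.3, -0.6, 0.2, 0.4]
toy_segments = [(0, a), (100, b), (200, list(a)), (300, c)]
duplicates = find_copy_move(toy_segments)
```

With n segments, sorting costs O(n log n) comparisons and the neighbour scan costs O(n · window), in place of the O(n²) comparisons needed to check every pair. A real detector would also need a tolerance that survives post-processing such as re-encoding, which this sketch does not address.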


2016 ◽  
Vol 28 (1) ◽  
pp. 138-144
Author(s):  
Horderlin Vrangel Robles ◽  
Valentin Molina ◽  
Luis Martinez ◽  
Hermann Davila

This article assesses the effectiveness of several algorithms that use basic signal processing methods for voice activity detection (VAD). The algorithms considered are the short-time or spectral energy-based endpoint detection algorithm, the zero-crossing rate method, and the High Order Difference (HOD) method. First, the concept of VAD is introduced, along with the need to apply such algorithms to River Plate Spanish. Then the state-of-the-art techniques and algorithms for detecting voice activity are summarized, together with the evidence and experiments used to implement the algorithms with the BEPPA corpus (Evaluation Battery for Patients with Auditive Prostheses; BEPPA is the acronym in Spanish).
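
Two of the frame-level measures named in this abstract, short-time energy and zero-crossing rate, can be sketched as follows. This is an illustration under assumptions, not the algorithms evaluated in the article: the frame length, hop size, and `energy_threshold` are hypothetical values, the function names are invented for the example, and the HOD method is not covered.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample list into frames of frame_len, advancing by hop."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy_vad(samples, frame_len=4, hop=4, energy_threshold=0.01):
    """Mark each frame as speech (True) when its energy exceeds a threshold.

    The threshold is an assumed constant; practical detectors usually
    estimate it from the background noise level.
    """
    return [short_time_energy(f) > energy_threshold
            for f in frame_signal(samples, frame_len, hop)]


# Toy signal: silence, a short burst, then silence again.
toy = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.0] * 4
decisions = energy_vad(toy)
```

Energy alone tends to miss low-energy unvoiced sounds, which is one reason the zero-crossing rate is often used alongside it.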


2019 ◽  
Author(s):  
Ruixi Lin ◽  
Charles Costello ◽  
Charles Jankowski ◽  
Vishwas Mruthyunjaya

2021 ◽  
Vol 175 ◽  
pp. 107832
Author(s):  
Joaquín García-Gómez ◽  
Roberto Gil-Pita ◽  
Miguel Aguilar-Ortega ◽  
Manuel Utrilla-Manso ◽  
Manuel Rosa-Zurera ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they rely strongly on the augmentation of unannotated data. This remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
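
The dependence on augmentation mentioned in this abstract comes from how FixMatch treats unlabeled examples: a prediction on a weakly augmented view becomes a pseudo-label, which is then used to train on a strongly augmented view of the same example. The sketch below shows only that unlabeled loss term, as generally described for FixMatch, and does not reproduce the modifications proposed in the article. The function name is hypothetical, the inputs are assumed to be class-probability lists already produced by a model, and the default threshold of 0.95 is an assumption.

```python
import math


def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Confidence-masked cross-entropy over a batch of unlabeled examples.

    weak_probs[i]:   class probabilities for the weakly augmented view.
    strong_probs[i]: class probabilities for the strongly augmented view.
    Examples whose weak-view confidence is below threshold contribute zero.
    """
    total = 0.0
    for p_weak, p_strong in zip(weak_probs, strong_probs):
        confidence = max(p_weak)
        if confidence >= threshold:
            pseudo_label = p_weak.index(confidence)
            total += -math.log(p_strong[pseudo_label])
    # Average over the whole batch, including the masked-out examples.
    return total / len(weak_probs)


# Toy batch: only the first example is confident enough to be used.
weak = [[0.97, 0.03], [0.60, 0.40]]
strong = [[0.50, 0.50], [0.90, 0.10]]
loss = fixmatch_unlabeled_loss(weak, strong)
```

Because the pseudo-label is only useful when the strong augmentation changes the input without changing its class, the choice of augmentations matters, which is the question the article examines for audio.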


Author(s):  
R. V. Prasad ◽  
R. Muralishankar ◽  
S. Vijay ◽  
H. N. Shankar ◽  
Przemyslaw Pawelczak ◽  
...  
