audio signal Latest Research Papers

Automatic Speech Classifier for Mild Cognitive Impairment and Early Dementia

ACM Transactions on Computing for Healthcare ◽

10.1145/3469089 ◽

2022 ◽

Vol 3 (1) ◽

pp. 1-11

Author(s):

Flavio Bertini ◽

Davide Allevi ◽

Gianluca Lutero ◽

Danilo Montesi ◽

Laura Calzà

Keyword(s):

Data Augmentation ◽

Audio Signal ◽

World Health ◽

Screening Tests ◽

Neuropsychological Screening ◽

Slowing Down ◽

Neurodegenerative Process ◽

Automated Method ◽

Audio Files ◽

Health Organization

The World Health Organization estimates that 50 million people are currently living with dementia worldwide and that this figure will almost triple by 2050. Current pharmacological treatments are only symptomatic, and drugs or other therapies are ineffective in slowing down or curing the neurodegenerative process underlying dementia. Therefore, early detection of cognitive decline is of the utmost importance for responding promptly and delivering preventive interventions. Recently, researchers showed that speech alterations might be one of the earliest signs of cognitive defect, observable well before other cognitive deficits become manifest. In this article, we propose a fully automated method able to classify the audio files of the subjects according to the progression of the pathology. In particular, we trained a specific type of artificial neural network, called an autoencoder, using the visual representation of the subjects' audio signal, that is, the spectrogram. Moreover, we used a data augmentation approach to overcome the need for the large amount of annotated data usually required during the training phase, which is one of the major obstacles in deep learning. We evaluated the proposed method using a dataset of 288 audio files from 96 subjects: 48 healthy controls and 48 cognitively impaired participants. The proposed method obtained good classification results compared to the state-of-the-art neuropsychological screening tests and, with an accuracy of 90.57%, outperformed the methods based on manual transcription and annotation of speech.
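The spectrogram that serves as the network input can be illustrated with a minimal sketch. This is not the authors' pipeline: the frame length, hop size, Hann window, and the synthetic tone standing in for a speech recording are all assumptions chosen for illustration, and the autoencoder and data augmentation steps are not shown.

```python
import cmath
import math


def hann(n):
    """Return an n-point symmetric Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per frame, one column per frequency bin.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = hann(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        # Keep only the non-negative frequencies of the discrete Fourier transform.
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                      for t in range(frame_len))
            row.append(abs(acc))
        rows.append(row)
    return rows


# Synthetic 1 kHz tone sampled at 8 kHz (a stand-in for a speech recording).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)

# Each bin spans sample_rate / frame_len = 125 Hz, so the tone falls in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting grid of magnitudes can be treated as a grayscale image, which is what lets an image-style network consume audio.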

1-Dimensional Polynomial Neural Networks for audio signal related problems

Knowledge-Based Systems ◽

10.1016/j.knosys.2022.108174 ◽

2022 ◽

pp. 108174

Author(s):

Habib Ben Abdallah ◽

Christopher J. Henry ◽

Sheela Ramanna

Keyword(s):

Neural Networks ◽

Audio Signal

Intelligent Audio Signal Processing for Detecting Rainforest Species Using Deep Learning

Intelligent Automation & Soft Computing ◽

10.32604/iasc.2022.019811 ◽

2022 ◽

Vol 31 (2) ◽

pp. 693-706

Author(s):

Rakesh Kumar ◽

Meenu Gupta ◽

Shakeel Ahmed ◽

Abdulaziz Alhumam ◽

Tushar Aggarwal

Keyword(s):

Signal Processing ◽

Deep Learning ◽

Audio Signal ◽

Audio Signal Processing

A mesh network case study for digital audio signal processing in Smart Farm

Internet of Things ◽

10.1016/j.iot.2021.100488 ◽

2022 ◽

pp. 100488

Author(s):

João Paulo Lemos Escola ◽

Uender Barbosa de Souza ◽

Rodrigo Capobianco Guido ◽

Ivan Nunes da Silva ◽

Jovander da Silva Freitas ◽

...

Keyword(s):

Signal Processing ◽

Audio Signal ◽

Mesh Network ◽

Digital Audio ◽

Audio Signal Processing

Prototype Implementation of Innovative Braille Translator for the Visually Impaired With Hearing Deficiency

10.4018/978-1-7998-4186-9.ch014 ◽

2022 ◽

pp. 272-290

Author(s):

Soumen Santra ◽

Arpan Deyasi

Keyword(s):

Laboratory Condition ◽

Special Class ◽

Visually Impaired ◽

Audio Signal ◽

External World ◽

Experimental Results ◽

Sensory Organs ◽

Prototype Implementation ◽

Impaired People ◽

Novel Model

Text-to-Braille and speech-to-Braille conversion are not yet available in combined form for the visually impaired, and there is a tremendous need for a device that can serve this special class of people. The present chapter deals with a novel model designed to help people with either visual or hearing impairment. The proposal is unique and is supported by experimental results obtained under laboratory conditions. The device will help people read text in Braille and will also convert the same content to an audio signal. Since text and audio are the two main interfaces through which a person communicates with the external world, apart from the functions of the sensory organs, the work is relevant. With the help of DANET, the same data, in text or speech form, can be accessed on more than one digital device simultaneously.
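The text-to-Braille step can be sketched in software as a lookup from letters to Braille cells. The chapter describes a hardware prototype and does not state its mapping, so the following is only an assumption-laden illustration: it covers uncontracted lowercase letters, uses the Unicode Braille Patterns block for output, and the function names are hypothetical.

```python
# Dot numbers (1-6) raised for each letter in uncontracted Braille.
LETTER_DOTS = {
    "a": "1", "b": "12", "c": "14", "d": "145", "e": "15",
    "f": "124", "g": "1245", "h": "125", "i": "24", "j": "245",
    "k": "13", "l": "123", "m": "134", "n": "1345", "o": "135",
    "p": "1234", "q": "12345", "r": "1235", "s": "234", "t": "2345",
    "u": "136", "v": "1236", "w": "2456", "x": "1346", "y": "13456",
    "z": "1356",
}

BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block (blank cell)


def to_cell(dots):
    """Turn a string of dot numbers into one Unicode Braille character."""
    code = BRAILLE_BASE
    for d in dots:
        code |= 1 << (int(d) - 1)  # dot n corresponds to bit n-1
    return chr(code)


def text_to_braille(text):
    """Map letters and spaces to Braille cells; pass other characters through."""
    out = []
    for ch in text.lower():
        if ch in LETTER_DOTS:
            out.append(to_cell(LETTER_DOTS[ch]))
        elif ch == " ":
            out.append(chr(BRAILLE_BASE))
        else:
            out.append(ch)  # digits and punctuation are left unmapped in this sketch
    return "".join(out)


cells = text_to_braille("abc")
```

A physical device would drive six pins per cell from the same bit pattern instead of emitting Unicode characters.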

Deep Learning Approach for Protecting Voice-Controllable Devices From Laser Attacks

10.4018/978-1-7998-7323-5.ch008 ◽

2022 ◽

pp. 125-142

Author(s):

Vijay Srinivas Tida ◽

Raghabendra Shah ◽

Xiali Hei

Keyword(s):

Deep Learning ◽

Audio Signal ◽

Audio Signals ◽

Mel Frequency Cepstral Coefficients ◽

Voice Command ◽

Machine Learning Model ◽

Signal Injection ◽

Controllable Systems ◽

The Voice ◽

Modulated Light

Laser-based audio signal injection can be used to attack voice-controllable systems. An attacker can aim amplitude-modulated light at the microphone's aperture, and the signal injection acts as a remote voice-command attack on voice-controllable systems. Attackers exploit such vulnerabilities to steal physical devices or virtual assets, for example by placing orders or withdrawing money. Therefore, detection of these signals is important because almost every device can be attacked using these amplitude-modulated laser signals. In this project, the authors use deep learning to classify incoming signals as normal voice commands or laser-based audio signals. Mel frequency cepstral coefficients (MFCC) are derived from the audio signals and used as features for the classification. If the audio signals are identified as laser signals, the voice command can be disabled, and an alert can be displayed to the victim. The maximum accuracy of the machine learning model was 100%, and in the real world it is around 95%.
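How MFCC features are derived from one audio frame can be sketched as follows. This is a textbook-style illustration, not the authors' code: the Hamming window, the number of mel filters and coefficients, the frame length, and the synthetic tones are assumptions, and the deep learning classifier that would consume the features is not shown.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of a Hamming-windowed frame (non-negative frequencies)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank


def mfcc(frame, sample_rate, num_filters=10, num_coeffs=5):
    """Log mel-band energies followed by a DCT-II, truncated to num_coeffs."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # A small constant avoids log(0) for empty bands.
    log_e = [math.log(sum(w * p for w, p in zip(filt, spec)) + 1e-10)
             for filt in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_e))
            for c in range(num_coeffs)]


# Two synthetic frames with different spectral content (hypothetical inputs).
sample_rate = 8000
low_tone = [math.sin(2 * math.pi * 300 * t / sample_rate) for t in range(64)]
high_tone = [math.sin(2 * math.pi * 2000 * t / sample_rate) for t in range(64)]
low_mfcc = mfcc(low_tone, sample_rate)
high_mfcc = mfcc(high_tone, sample_rate)
```

Frames with different spectral envelopes yield different coefficient vectors, which is the property a classifier relies on to separate signal classes.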

Classification of Parkinson’s disease patients based on spectrogram using local binary pattern descriptors

Journal of Physics Conference Series ◽

10.1088/1742-6596/2153/1/012014 ◽

2022 ◽

Vol 2153 (1) ◽

pp. 012014

Author(s):

E Gelvez-Almeida ◽

A Vásquez-Coronel ◽

R Guatelli ◽

V Aubin ◽

M Mora

Keyword(s):

Parkinson's Disease ◽

Extreme Learning Machine ◽

Local Binary Pattern ◽

Audio Signal ◽

Training Time ◽

Network Training ◽

Learning Machine ◽

Hidden Layer ◽

Rate Of Success

Extreme learning machine is an algorithm that has shown good performance on classification and regression problems. It has gained great acceptance in the scientific community due to the simplicity of the model and its great generalization capacity. This work proposes the use of extreme learning machine neural networks to carry out the classification between Parkinson’s disease patients and healthy individuals. The descriptor used is the feature vector generated by applying the local binary pattern algorithm to the grayscale spectrograms. The spectrograms are obtained from the audio signal samples in the considered repository. Experiments are conducted with single hidden layer and multilayer extreme learning machine networks, comparing the results of each structure. Results show that the hierarchical extreme learning machine with three hidden layers has better general performance than multilayer extreme learning machine networks and a single hidden layer extreme learning machine. The rate of success obtained is within the ranges presented in the literature. However, the hierarchical network training time is considerably shorter than that of multilayer networks with three or two hidden layers.
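The local binary pattern descriptor applied to a grayscale spectrogram can be sketched as below. The abstract does not give the neighbourhood or histogram settings, so the basic 3x3, 8-neighbour variant and the normalised 256-bin histogram are assumptions; the extreme learning machine classifier is not shown.

```python
# Neighbour offsets in clockwise order starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit code: bit i is set when neighbour i is >= the centre pixel."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Tiny hypothetical "spectrogram" patches.
flat = [[5] * 4 for _ in range(4)]             # uniform region
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]     # only the top-left neighbour is brighter

flat_hist = lbp_histogram(flat)
corner_code = lbp_code(corner, 1, 1)
```

The histogram has a fixed length regardless of spectrogram size, so it can be fed directly to a classifier as a feature vector.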

Musical Information Visualization System

Electronic and Acoustic Engineering ◽

10.20535/2617-0965.eae.228487 ◽

2021 ◽

Vol 4 (4) ◽

Author(s):

Roman O. Yaroshenko

Keyword(s):

Real Time ◽

Information Visualization ◽

Frequency Analysis ◽

Spectral Composition ◽

Audio Signal ◽

Signal Spectrum ◽

Time Data ◽

Visualization System ◽

Music Information ◽

Led Matrix

Visualization systems are widespread as personal computer software. This article presents a system that processes audio data. The system visualizes the ratio of spectrum amplitudes and binds fixed frequencies to colors. The technology the device uses to process audio signals and the components of the device are considered. To increase information processing speed, a 32-bit controller and a graphic equalizer with seven passbands were used. Music visualization is a function that is widespread in media player software on different operating systems. It shows animated images that depend on the music signal. Images are usually rendered in real time and synchronized with the audio track being played. Music and visualization merge in different kinds of art: opera, ballet, music drama, and movies. The dependence between auditory and visual sensations is used to increase emotional perception for ordinary listeners. The systems currently being actively promoted rely on several tools for personal computers, such as After Effects (the Audio Spectrum effect), VSDC Video Editor Free (Audio Spectrum Visualizer), and Magic Music Visuals. The software mentioned above has one disadvantage: streaming video cannot be used while audio is being received, and the resulting video sequence must be processed and rendered. The purpose of the work is to determine the features of spectral analysis of music information, taking into account real-time data processing, and to propose a variant of the music information visualization system that displays the spectral composition of music and the amplitude of individual harmonics, fills the LED matrix with the appropriate color depending on the amplitude of the audio signal, and allows wireless signal transmission from the music source to the visual effects device.
Frequency analysis of the spectrum, with estimation of the amplitude of the spectral components of the musical data arriving at the device, was chosen for this project. The method is based on analyzing the spectrum in selected frequency bands, which in turn simplifies finding maxima at different frequencies. The proposed variant of the musical information visualization system displays on the LED matrix the colors that correspond to the frequencies of the spectral components in the musical composition. Moreover, the number of LEDs lit is proportional to the ratio of the amplitudes of the signal’s frequency components. The desired result is achieved by using a Fast Fourier Transform and selecting a Hann or Hamming window to improve the analysis of the signal spectrum. The amplitudes of the individual spectral components are additionally estimated, and each frequency band has its own color. The system analyzes the spectral components and frequencies of the musical information, and this information determines the colors displayed on the LED matrix. Using a 32-bit microcontroller provides sufficient audio signal processing speed with minimal delay. To increase the accuracy and speed of the frequency analysis, the sound range is divided into seven bands; the seven-band graphic equalizer MSGEQ7 was used for this purpose. Music information is transmitted to the system via Bluetooth, which greatly simplifies selecting and connecting the music data source.
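The band analysis and LED mapping described above can be sketched in software. This is only an illustration under stated assumptions: the band edges, the column height of 8 LEDs, the sample rate, and the function names are hypothetical, and the device itself obtains its seven band levels from the MSGEQ7 hardware equalizer.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def band_amplitudes(samples, sample_rate, band_edges):
    """Peak spectral magnitude in each band of a Hann-windowed block."""
    n = len(samples)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    spectrum = fft([s * w for s, w in zip(samples, window)])
    amps = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        mags = [abs(spectrum[k]) for k in range(1, n // 2)
                if lo <= k * sample_rate / n < hi]
        amps.append(max(mags) if mags else 0.0)
    return amps


def leds_per_band(amps, leds_per_column=8):
    """Number of lit LEDs per band, proportional to amplitude relative to the peak."""
    peak = max(amps)
    if peak == 0:
        return [0] * len(amps)
    return [round(leds_per_column * a / peak) for a in amps]


# Hypothetical band edges in Hz giving seven bands across the audible range.
BAND_EDGES = [20, 100, 250, 600, 1500, 4000, 10000, 20000]

sample_rate = 44100
block = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(1024)]
leds = leds_per_band(band_amplitudes(block, sample_rate, BAND_EDGES))
```

For the 1 kHz test tone, the fourth band (600 to 1500 Hz) lights its full column; each band index would then be mapped to its fixed color on the matrix.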

Audio Encryption Algorithm Based on Chen Memristor Chaotic System

Symmetry ◽

10.3390/sym14010017 ◽

2021 ◽

Vol 14 (1) ◽

pp. 17

Author(s):

Wanying Dai ◽

Xiangliang Xu ◽

Xiaoming Song ◽

Guodong Li

Keyword(s):

Fourier Transform ◽

Fast Fourier Transform ◽

Chaotic System ◽

Basis Function ◽

Color Image ◽

Audio Signal ◽

Encryption Algorithm ◽

Audio Signals ◽

Encryption Algorithms ◽

Audio Encryption

The data space for audio signals is large, the correlation is strong, and traditional encryption algorithms cannot meet the needs of efficiency and security. To solve this problem, an audio encryption algorithm based on the Chen memristor chaotic system is proposed. The core idea of the algorithm is to encrypt the audio signal into color image information. Most traditional audio encryption algorithms transmit the ciphertext in the form of noise, which easily attracts the attention of attackers. In this paper, a special encryption method is used to obtain higher security. Firstly, the Fast Walsh–Hadamard Transform (FWHT) is used to compress and denoise the signal. Unlike the Fast Fourier Transform (FFT) and the Discrete Cosine Transform (DCT), FWHT has good energy compression characteristics. In addition, compared with the trigonometric basis functions of the Fast Fourier Transform, the rectangular basis functions of the FWHT can be implemented more efficiently in digital circuits. The reconstructed dual-channel audio signal is then transformed into the R and B layers of the digital image matrix, respectively. Furthermore, a new Chen memristor chaotic system solves the periodic window problems, such as the limited chaos range and nonuniform distribution. It can generate a mask block with high complexity, which is filled into the G layer of the color image matrix to obtain a color audio image. Next, by combining plaintext information with color audio images, interactive channel shuffling not only weakens the correlation between adjacent samples but also effectively resists chosen-plaintext attacks. Finally, the cryptographic block is used for overlapping diffusion encryption to fill the silent periods of the speech signal, so as to obtain the ciphertext audio. Experimental results and comparative analysis show that the algorithm is suitable for different types of audio signals and can resist many common cryptanalytic attacks. Compared with similar audio encryption algorithms, the security index of the algorithm is better, and the efficiency of the algorithm is greatly improved.
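The Fast Walsh–Hadamard Transform used in the first step can be sketched as below. This is the standard butterfly form in natural (Hadamard) ordering without normalisation; the paper's exact ordering, scaling, and compression threshold are not given in the abstract, so those details are assumptions, and the chaotic masking and shuffling stages are not shown.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform (natural/Hadamard order).

    Uses only additions and subtractions, which is why the transform
    maps well onto simple digital hardware.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a


def inverse_fwht(coeffs):
    """The transform is its own inverse up to a factor of 1/n."""
    n = len(coeffs)
    return [c / n for c in fwht(coeffs)]


samples = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical block of audio samples
coeffs = fwht(samples)
restored = inverse_fwht(coeffs)
```

Compression or denoising would then discard or quantise small coefficients before inverting, a step left out here because the abstract does not specify it.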

Secure audio signal transmission based on color image watermarking

Security and Privacy ◽

10.1002/spy2.185 ◽

2021 ◽

Author(s):

Azza Dandooh ◽

Adel S. El‐Fishawy ◽

Fathi E. Abd El‐Samie ◽

Ezz El‐Din Hemdan

Keyword(s):

Color Image ◽

Image Watermarking ◽

Signal Transmission ◽

Audio Signal ◽

Color Image Watermarking
