Intelligent module for recognizing emotions by voice

2021 ◽  
pp. 46-52
Author(s):  
Oleg Ilarionov ◽  
Anton Astakhov ◽  
Anna Krasovska ◽  
Iryna Domanetska

Speech is the main way people communicate, and from speech people receive not only semantic but also emotional information. Recognition of emotions by voice is relevant to areas such as psychological care, security systems development, lie detection, customer relationship analysis, and video game development. Because the recognition of emotions by a person is subjective, and therefore inexact and time-consuming, there is a need for software that can solve this problem. The article considers the state of the problem of recognizing human emotions by voice. Recent publications are analyzed with respect to the approaches they use, namely emotion models, datasets, feature extraction methods, and classifiers. It is determined that existing developments have an average accuracy of about 0.75. The general structure of a system for recognizing human emotions by voice is analyzed, and the corresponding intelligent module is designed and developed. The Unified Modeling Language (UML) is used to create a component diagram and a class diagram. The RAVDESS and TESS datasets were selected to diversify the training sample. A discrete model of emotions (joy, sadness, anger, disgust, fear, surprise, calm, neutral), the MFCC (Mel-Frequency Cepstral Coefficients) method for feature extraction, and a convolutional neural network for classification were used. The neural network was developed using the TensorFlow and Keras machine learning libraries. The spectrogram and graphs of the audio signal, as well as graphs of accuracy and recognition error, are constructed. As a result of the software implementation of the intelligent module for recognizing emotions by voice, the validation accuracy was increased to 0.8.
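The abstract names the pipeline (MFCC features fed to a Keras CNN over eight discrete emotions) but includes no code. The following is a minimal sketch of that pipeline, assuming librosa for feature extraction; the 40-coefficient setting, layer sizes, and time-averaging of MFCCs are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: MFCC feature extraction -> Keras 1D CNN over 8 emotion classes.
# All hyperparameters here are illustrative assumptions.
import numpy as np
import librosa
from tensorflow import keras

EMOTIONS = ["neutral", "calm", "joy", "sadness",
            "anger", "fear", "disgust", "surprise"]  # discrete emotion model

def extract_mfcc(path, n_mfcc=40):
    """Load an audio file and return a fixed-size MFCC feature vector."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # average over time -> shape (n_mfcc,)

def build_model(n_mfcc=40, n_classes=len(EMOTIONS)):
    """1D convolutional classifier over the MFCC vector."""
    model = keras.Sequential([
        keras.layers.Input(shape=(n_mfcc, 1)),
        keras.layers.Conv1D(64, 5, activation="relu"),
        keras.layers.MaxPooling1D(2),
        keras.layers.Conv1D(128, 5, activation="relu"),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage: X = np.stack([extract_mfcc(p) for p in paths])[..., None]
```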

Author(s):  
Azadeh Bashiri ◽  
Roghaye Hosseinkhani

Crying, as the only way babies communicate with their surrounding environment, can occur for many reasons, such as disease, suffocation, hunger, feeling cold or hot, and pain. The analysis and detection of its source are therefore very important for parents and health care providers, so the present study was designed to test the performance of neural networks in identifying the source of babies' crying. The study combines a genetic algorithm and an artificial neural network with LPC (Linear Predictive Coding) and MFCC (Mel-Frequency Cepstral Coefficients) features to classify babies' crying. The results indicate the superiority of the proposed method over previous methods: it achieved the highest accuracy in classifying newborn crying among the studies reviewed. Developing such audio-signal classification methods is promising, and they can be effectively applied in different areas, such as the analysis of babies' crying.
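The abstract combines LPC and MFCC features as input to the GA-tuned network but does not show the feature step. Below is a minimal sketch of that combined extraction, assuming librosa; the coefficient counts and sample rate are illustrative assumptions.

```python
# Sketch: LPC and MFCC coefficients concatenated into one feature vector
# for an ANN classifier. Coefficient counts are assumptions for illustration.
import numpy as np
import librosa

def cry_features(path, n_mfcc=13, lpc_order=12):
    """Return a combined LPC + MFCC feature vector for one cry recording."""
    y, sr = librosa.load(path, sr=16000)
    lpc = librosa.lpc(y, order=lpc_order)                    # (lpc_order + 1,)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    return np.concatenate([lpc, mfcc])
```

In the paper's scheme the genetic algorithm then optimizes the neural network that consumes these vectors; that optimization step is not reproduced here.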


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 676
Author(s):  
Andrej Zgank

Animal activity acoustic monitoring is becoming one of the necessary tools in agriculture, including beekeeping, where it can assist in the control of beehives in remote locations. Bee swarm activity can be classified from audio signals using such approaches. A deep neural network (DNN) IoT-based acoustic swarm classification approach is proposed in this paper. Audio recordings were obtained from the Open Source Beehive project, and Mel-frequency cepstral coefficient (MFCC) features were extracted from the audio signal. The lossless WAV and lossy MP3 audio formats were compared for IoT-based solutions, and the impact of the deep neural network parameters on the classification results was analyzed. The best overall classification accuracy with uncompressed audio was 94.09%, while MP3 compression degraded the DNN accuracy by over 10%. The evaluation of the proposed DNN IoT-based bee activity acoustic classification showed improved results compared to a previous hidden Markov model system.
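The WAV-vs-MP3 comparison amounts to measuring how lossy compression distorts the MFCC features the DNN consumes. A minimal sketch of that measurement follows, assuming librosa can decode both formats through its audio backend; the RMS distortion metric and sample rate are illustrative assumptions, not the paper's evaluation protocol.

```python
# Sketch: extract MFCCs from a lossless and a lossy decode of the same
# recording and measure the feature distortion that degrades DNN accuracy.
import numpy as np
import librosa

def mfcc_from(path, n_mfcc=13, sr=16000):
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

def format_distortion(wav_path, mp3_path):
    """RMS difference between MFCCs of the lossless and lossy versions."""
    a, b = mfcc_from(wav_path), mfcc_from(mp3_path)
    n = min(a.shape[1], b.shape[1])  # decoders may pad/trim differently
    return float(np.sqrt(np.mean((a[:, :n] - b[:, :n]) ** 2)))
```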


2021 ◽  
Vol 13 (4) ◽  
pp. 628
Author(s):  
Liang Ye ◽  
Tong Liu ◽  
Tian Han ◽  
Hany Ferdinando ◽  
Tapio Seppänen ◽  
...  

Campus violence is a common social phenomenon all over the world and the most harmful type of school bullying event. As artificial intelligence and remote sensing techniques develop, several methods become possible for detecting campus violence, e.g., movement sensor-based and video sequence-based methods using sensors and surveillance cameras. In this paper, the authors use image features and acoustic features for campus violence detection. Campus violence data are gathered by role-playing, and 4096-dimensional feature vectors are extracted from every 16 frames of video. The C3D (Convolutional 3D) neural network is used for feature extraction and classification, and an average recognition accuracy of 92.00% is achieved. Mel-frequency cepstral coefficients (MFCCs) are extracted as acoustic features, and three speech emotion databases are involved. The C3D neural network is used for classification, and the average recognition accuracies are 88.33%, 95.00%, and 91.67%, respectively. To solve the problem of evidence conflict, the authors propose an improved Dempster–Shafer (D–S) algorithm. Compared with existing D–S theory, the improved algorithm increases the recognition accuracy by 10.79%, ultimately reaching 97.00%.
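The abstract does not detail the improved algorithm, so the sketch below shows the classical Dempster rule of combination that it builds on: two evidence sources (here, hypothetically, the image and acoustic classifiers) are fused by multiplying masses of intersecting hypotheses and renormalizing away the conflict mass K. The example masses are invented for illustration.

```python
# Classical Dempster rule of combination over mass functions keyed by
# frozenset hypotheses. The authors' improvement is not reproduced here.
from itertools import product

def dempster_combine(m1, m2):
    """Fuse two mass functions; raises on total conflict (K = 1)."""
    combined, conflict = {}, 0.0
    for (h1, p1), (h2, p2) in product(m1.items(), m2.items()):
        inter = h1 & h2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + p1 * p2
        else:
            conflict += p1 * p2          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {h: p / (1.0 - conflict) for h, p in combined.items()}

# Hypothetical fusion of video and audio evidence for {violent, normal}:
video = {frozenset({"violent"}): 0.7, frozenset({"normal"}): 0.3}
audio = {frozenset({"violent"}): 0.6, frozenset({"normal"}): 0.4}
print(dempster_combine(video, audio))  # violent mass exceeds either source
```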


2011 ◽  
Vol 467-469 ◽  
pp. 1505-1510
Author(s):  
Dan Liu ◽  
Ni Hong Wang ◽  
Gui Ying Li

This paper proposes a new method that uses a neural network to construct the solution of the Hamilton–Jacobi (HJ) inequality and optimizes the neural network weights using a genetic algorithm. This method makes the Lyapunov function satisfy the HJ inequality, avoids solving the HJ partial differential inequality directly, and thus overcomes the difficulty of its analytical treatment. Besides this, a design method for a nonlinear state-feedback L2-gain disturbance rejection controller based on the HJ inequality is proposed, and the general structure of an L2-gain disturbance rejection controller in neural network form is introduced. Simulation demonstrates that the controller design is feasible and that the closed-loop system ensures a finite L2 gain from the disturbance to the output.
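For context, the abstract does not spell out the inequality being solved; the standard L2-gain Hamilton–Jacobi inequality for an input-affine system (in the van der Schaft form) is stated below. The paper's exact formulation may differ.

```latex
% For the affine system \dot{x} = f(x) + g(x)w, z = h(x), the L2 gain from
% w to z is at most \gamma if a smooth V(x) \ge 0 with V(0) = 0 satisfies:
\[
  \frac{\partial V}{\partial x} f(x)
  + \frac{1}{2\gamma^{2}}
    \frac{\partial V}{\partial x} g(x) g(x)^{\top}
    \left(\frac{\partial V}{\partial x}\right)^{\top}
  + \frac{1}{2}\, h(x)^{\top} h(x) \le 0 .
\]
```

The method described above parameterizes such a V as a neural network and uses a genetic algorithm to search the weights so that the inequality holds, rather than solving the partial differential inequality analytically.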


2017 ◽  
Vol 120 ◽  
pp. 417-421
Author(s):  
Fatih Veysel Nurçin ◽  
Elbrus Imanov ◽  
Ali Işın ◽  
Dilber Uzun Ozsahin

Hypertension ◽  
2017 ◽  
Vol 70 (suppl_1) ◽  
Author(s):  
Francesco Lamonaca ◽  
Vitaliano Spagnuolo ◽  
Serena De Prisco ◽  
Domenico L Carnì ◽  
Domenico Grimaldi

The analysis of the PPG signal in the time domain for the evaluation of blood pressure (BP) is proposed. Features extracted from the PPG signal are used to train an Artificial Neural Network (ANN) to determine the function that fits the target systolic and diastolic BP. The PPG signals and BP data used in the analysis are provided by the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC II) database. A pre-analysis of the signal to remove inconsistent data is also proposed. A set of 1750 valid pulses is considered: 80% of the input samples are used for training the network, 10% for validation, and 10% for the final test. The results show that the error for both the systolic and diastolic BP evaluation is within ±3 mmHg. Tab. 1 shows the results for 20 randomly selected PPG pulses, comparing the systolic and diastolic blood pressure furnished by MIMIC II with the values evaluated by the trained ANN. Moreover, suitable hardware to validate the ANN against the sphygmomanometer is designed and realized. This hardware allows clinicians to collect data according to the requirements of the validation procedure. With the sphygmomanometer, the systolic and diastolic values refer to two different PPG pulses. As a consequence, a new hardware interface is proposed that allows the synchronized acquisition and storage of the PPG signal and the clinician's voice. For the validation, the clinician: (i) evaluates the BP on both arms and verifies that no significant differences occur; (ii) places the PPG sensor on a finger of one arm; (iii) starts recording both the PPG signal and the audio signal; (iv) evaluates the BP on the other arm with the sphygmomanometer and speaks the systolic and diastolic values when detected. Through a suitable post-processing algorithm, the systolic and diastolic values are associated with the corresponding PPG pulses. Following this procedure, the dataset to further validate the ANN according to the standard is obtained. Once validated, the ANN will be implemented on a smartphone, providing a pocket measurement system for blood pressure, oximetry, and heart rate.
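The 80/10/10 split and two-output regression described above can be sketched as follows; the abstract does not specify the network architecture, so the layer sizes, loss choice, and use of Keras are assumptions for illustration.

```python
# Sketch: ANN regressor mapping PPG time-domain features to
# [systolic, diastolic] BP, with the abstract's 80/10/10 data split.
import numpy as np
from tensorflow import keras

def split_80_10_10(X, y, seed=0):
    """Shuffle and split samples into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train, n_val = int(0.8 * len(X)), int(0.1 * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

def build_bp_ann(n_features):
    """Regressor with two outputs: systolic and diastolic BP (mmHg)."""
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(2),                 # [systolic, diastolic]
    ])
    # MAE in mmHg relates directly to the reported ±3 mmHg error band.
    model.compile(optimizer="adam", loss="mae")
    return model
```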

