acoustic features
Recently Published Documents


TOTAL DOCUMENTS: 1077 (FIVE YEARS: 399)
H-INDEX: 45 (FIVE YEARS: 7)

2022 ◽  
Vol 40 (1) ◽  
pp. 1-23
Author(s):  
Jiaxing Shen ◽  
Jiannong Cao ◽  
Oren Lederman ◽  
Shaojie Tang ◽  
Alex “Sandy” Pentland

User profiling refers to inferring people’s attributes of interest (AoIs), such as gender and occupation, which enables applications ranging from personalized services to collective analyses. Massive nonlinguistic audio data offers a novel opportunity for user profiling, driven by the growing prevalence of studies of spontaneous face-to-face communication. Nonlinguistic audio is coarse-grained audio data without linguistic content; it is what can be collected in privacy-sensitive situations such as doctor-patient dialogues. This opportunity facilitates optimized organizational management and personalized healthcare, especially for chronic diseases. In this article, we are the first to build a user profiling system that infers gender and personality from nonlinguistic audio. Instead of linguistic or acoustic features, which cannot be extracted from such data, we focus on conversational features that can reflect AoIs. We first develop an adaptive voice activity detection algorithm that addresses individual differences in voice as well as false-positive voice activities caused by people nearby. Second, we propose a gender-assisted multi-task learning method that combats the dynamics of human behavior by integrating gender differences and the correlation of personality traits. In an experimental evaluation with 100 people across 273 meetings, we achieved F1-scores of 0.759 for gender identification and 0.652 for personality recognition.
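As a rough illustration of the kind of speaker-adaptive, energy-based voice activity detection described above (a generic sketch, not the authors' algorithm; the running noise floor, the constants, and the assumption that the opening frames are silence are ours):

```python
import numpy as np

def adaptive_vad(frame_energy, init_frames=50, k=1.5, alpha=0.95):
    """Toy energy-based VAD with a speaker-adaptive threshold (illustrative only).

    frame_energy: 1-D array of per-frame energies from nonlinguistic audio.
    The threshold tracks a running noise floor, one simple way to handle
    individual differences in voice level and speech from people nearby.
    """
    noise_floor = np.mean(frame_energy[:init_frames])  # assume the opening frames are silence
    is_voice = np.zeros(len(frame_energy), dtype=bool)
    for i, e in enumerate(frame_energy):
        if e > k * noise_floor:
            is_voice[i] = True  # frame flagged as the wearer's speech
        else:
            # update the noise floor only on non-speech frames (exponential smoothing)
            noise_floor = alpha * noise_floor + (1 - alpha) * e
    return is_voice

# toy usage: quiet background with a burst of speech in the middle
energy = np.concatenate([np.full(100, 0.01), np.full(50, 0.2), np.full(100, 0.01)])
print(adaptive_vad(energy).sum(), "frames flagged as voice")
```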


PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0261151
Author(s):  
Jonna K. Vuoskoski ◽  
Janis H. Zickfeld ◽  
Vinoo Alluri ◽  
Vishnu Moorthigari ◽  
Beate Seibt

The experience often described as feeling moved, understood chiefly as a social-relational emotion with social bonding functions, has gained significant research interest in recent years. Although listening to music often evokes what people describe as feeling moved, very little is known about the appraisals or musical features contributing to the experience. In the present study, we investigated experiences of feeling moved in response to music using a continuous rating paradigm. A total of 415 US participants completed an online experiment in which they listened to seven moving musical excerpts and rated their experience while listening. For each participant, each excerpt was randomly coupled with one of seven rating scales (perceived sadness, perceived joy, feeling moved or touched, sense of connection, perceived beauty, warmth [in the chest], or chills). The results revealed that musically evoked experiences of feeling moved are associated with a pattern of appraisals, physiological sensations, and trait correlations similar to that found in previous studies of feeling moved by videos depicting social scenarios. Feeling moved or touched by both sadly and joyfully moving music was associated with experiencing a sense of connection and perceiving joy in the music, whereas perceived sadness was associated with feeling moved or touched only in the case of sadly moving music. Acoustic features related to arousal contributed to feeling moved only in the case of joyfully moving music. Finally, trait empathic concern was positively associated with feeling moved or touched by music. These findings support the role of social cognitive and empathic processes in music listening, and highlight the social-relational aspects of feeling moved or touched by music.
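For readers unfamiliar with arousal-related acoustic descriptors, the sketch below shows how two commonly used proxies (frame-wise RMS energy and spectral centroid) can be computed with librosa; the study's actual feature set is not specified here, and the synthetic signal merely stands in for a musical excerpt:

```python
import numpy as np
import librosa

sr = 22050
y = np.random.randn(sr * 5).astype(np.float32)  # placeholder standing in for a musical excerpt

rms = librosa.feature.rms(y=y)[0]                            # frame-wise loudness proxy
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]  # frame-wise brightness proxy
print("mean RMS:", rms.mean(), "mean spectral centroid (Hz):", centroid.mean())
```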


2022 ◽  
Vol 3 (4) ◽  
pp. 295-307
Author(s):  
Subarna Shakya

Personal computer-based data collection and analysis systems have become more resilient thanks to recent advances in digital signal processing technology. Speaker recognition is a signal processing approach that uses the specific information contained in voice waves to automatically identify the speaker. This study examines systems that can recognize a wide range of emotional states in speech from a single source. Because it offers insight into human brain states, emotion recognition is an active topic in the development of human-computer interfaces for speech processing, where it is often necessary to recognize the emotional state of the user. This research analyses an effort to discern emotional states such as anger, joy, neutral, fear, and sadness using classification methods. An acoustic feature that measures unpredictability is used in conjunction with a non-linear signal quantification approach to identify emotions. The unpredictability of each emotional signal is captured in a feature vector constructed from the calculated entropy measurements. The acoustic features extracted from the speech signal are then used to train the proposed neural network and are passed to a Linear Discriminant Analysis (LDA) approach, combined with acoustic feature extraction, for the final classification. The article also compares the proposed work with modern classifiers such as k-nearest neighbor, support vector machine, and a plain linear discriminant approach. A key advantage of the proposed algorithm is that it separates negative and positive emotional features and yields good classification results. According to efficient cross-validation on an available emotional speech dataset, the proposed single-source LDA classifier can recognize emotions in speech signals with above 90 percent accuracy across the various emotional states.
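A minimal sketch of the entropy-plus-LDA idea, assuming spectral entropy as the "unpredictability" measure and using scikit-learn's LDA with cross-validation; the placeholder frames and labels stand in for a labelled emotional-speech corpus, and none of the specifics below are taken from the article:

```python
import numpy as np
from scipy.stats import entropy
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def spectral_entropy(frame, n_fft=512):
    """Shannon entropy of a frame's normalized power spectrum (one plausible
    'unpredictability' measure; the article does not give its exact formula)."""
    psd = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    psd /= psd.sum() + 1e-12
    return entropy(psd)

def utterance_features(frames):
    # feature vector = per-frame spectral entropies of one utterance
    return np.array([spectral_entropy(f) for f in frames])

# placeholder corpus: 100 utterances x 10 frames x 400 samples, 5 emotion labels
utterances = np.random.randn(100, 10, 400)
X = np.array([utterance_features(u) for u in utterances])
y = np.random.randint(0, 5, 100)

lda = LinearDiscriminantAnalysis()
print("mean cross-validated accuracy:", cross_val_score(lda, X, y, cv=5).mean())
```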


2022 ◽  
Vol 21 (1) ◽  
Author(s):  
Ning Wang ◽  
Alison Testa ◽  
Barry J. Marshall

Objective: Bowel sounds (BS) carry useful information about gastrointestinal condition and feeding status. Interest in computerized bowel sound-based analysis has grown recently and techniques have evolved rapidly. An important first step for these analyses is to extract BS segments while neglecting silent periods. The purpose of this study was to develop a convolutional neural network-based BS detector able to detect all types of BS with accurate time stamps, and to investigate the effect of food consumption on some acoustic features of BS with the proposed detector. Results: Audio recordings from 40 volunteers were collected, and a BS dataset consisting of 6700 manually labelled segments was generated for training and testing the proposed BS detector. The detector attained 91.06% and 90.78% accuracy on the validation dataset and the across-subject test dataset, respectively, with well-balanced sensitivity and specificity. The detection rates evaluated on different BS types were also satisfactory. Four acoustic features were evaluated to investigate the effect of food. The total duration and spectral bandwidth of BS showed significant differences before and after food consumption, while no significant difference was observed in mean-crossing rate values. Conclusion: We demonstrated that the proposed BS detector is effective in detecting all types of BS and provides an accurate time stamp for each BS. The characteristics of BS types and their effect on detection accuracy are discussed. The proposed detector could have clinical applications in post-operative ileus prognosis and monitoring of food intake.
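As an illustration of two of the acoustic features mentioned (spectral bandwidth and mean-crossing rate), the following sketch computes them for a single placeholder segment; segment boundaries would in practice come from the CNN detector, and the sampling rate and signal here are assumptions:

```python
import numpy as np
import librosa

sr = 8000
segment = np.random.randn(sr // 2)  # placeholder 0.5 s bowel-sound segment (detector output assumed)

bandwidth = librosa.feature.spectral_bandwidth(y=segment, sr=sr).mean()

# mean-crossing rate: fraction of consecutive samples that straddle the signal's mean
centered = segment - segment.mean()
mcr = np.mean(np.abs(np.diff(np.sign(centered)))) / 2

print(f"spectral bandwidth: {bandwidth:.1f} Hz, mean-crossing rate: {mcr:.3f}")
```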


2022 ◽  
Vol 43 (1) ◽  
pp. 22-31
Author(s):  
Maki Nanahara (Kato) ◽  
Kazumasa Yamamoto ◽  
Seiichi Nakagawa

2022 ◽  
pp. 629-647
Author(s):  
Yosra Abdulaziz Mohammed

Cries of infants can be seen as an indicator of pain. It has been shown that crying caused by pain, hunger, fear, stress, etc., exhibits different cry patterns. The work presented here introduces a comparative study of the performance of two classification techniques implemented in an automatic system for identifying two types of infant cries, pain and non-pain: Continuous Hidden Markov Models (CHMM) and Artificial Neural Networks (ANN). Two different sets of acoustic features, MFCC and LPCC, were extracted from the cry samples, and the feature vectors generated by each were fed into the classification module for training and testing. The results showed that the CHMM-based system performs better than the ANN-based one: CHMM gives the best identification rate at 96.1%, much higher than the 79% achieved by ANN, and in general the system based on MFCC features performed better than the one that uses LPCC features.
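A minimal sketch of an MFCC-plus-continuous-HMM cry classifier of the kind compared in the study (one HMM per class, predict the class whose model gives the test cry the highest log-likelihood), using librosa and hmmlearn; file names, sampling rate, and model sizes are illustrative assumptions, not the paper's settings:

```python
import numpy as np
import librosa
from hmmlearn.hmm import GaussianHMM

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Frame-wise MFCC matrix (frames x coefficients) for one cry recording."""
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

# hypothetical training files, one list per class
train_files = {"pain": ["pain_01.wav", "pain_02.wav"],
               "non_pain": ["nonpain_01.wav", "nonpain_02.wav"]}

models = {}
for label, files in train_files.items():
    feats = [mfcc_features(f) for f in files]
    X, lengths = np.vstack(feats), [len(f) for f in feats]
    # one continuous-observation HMM per class (5 states, diagonal covariances assumed)
    models[label] = GaussianHMM(n_components=5, covariance_type="diag").fit(X, lengths)

test = mfcc_features("unknown_cry.wav")  # hypothetical test recording
print("predicted class:", max(models, key=lambda lbl: models[lbl].score(test)))
```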

