Research on Multimodal Music Emotion Recognition Method Based on Image Sequence

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zhao Yu

A music performance system controls lighting changes by identifying the emotional elements of music, so a recognition error prevents it from creating a good stage effect. To address this, a multimodal music emotion recognition method based on image sequences is studied. The emotional characteristics of music, including acoustic, melody, and audio characteristics, are analyzed and used to construct a feature vector. A neural-network-based recognition and classification model is trained by adjusting the weights and thresholds of each layer, and the feature vector is then input into the trained model to realize intelligent recognition and classification of multimodal music emotion. The center clipping method gives the threshold for the starting-point range of a specific humming note; it is used to eliminate the low-amplitude part of the humming note signal so that the short-time spectral structure features and envelope features of the pitch can be extracted, completing the multimodal music emotion recognition. The results show that the calculated kappa coefficient k is greater than 0.75, indicating that the recognition and classification results agree well with the actual results and that the classification and recognition accuracy is high.
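As an illustration of the center clipping step mentioned in the abstract, the following is a minimal sketch of the textbook center clipper, not the paper's implementation. The function name `center_clip`, the `ratio` parameter, and the choice of a threshold proportional to the peak amplitude are assumptions made for this example.

```python
def center_clip(signal, ratio=0.3):
    """Textbook center clipper (illustrative; not taken from the paper).

    Samples whose magnitude is at most ratio * peak amplitude are set to
    zero; the remaining samples are shifted toward zero by that threshold.
    """
    if not signal:
        return []
    # Assumed threshold rule: a fixed fraction of the peak amplitude.
    threshold = ratio * max(abs(s) for s in signal)
    clipped = []
    for s in signal:
        if s > threshold:
            clipped.append(s - threshold)
        elif s < -threshold:
            clipped.append(s + threshold)
        else:
            clipped.append(0.0)  # low-amplitude part is eliminated
    return clipped


# Hypothetical frame of a humming-note signal.
frame = [0.1, -0.2, 1.0, -0.8, 0.25, 0.05]
print(center_clip(frame, ratio=0.3))
```

With this toy frame, only the two large-amplitude samples survive clipping, which is the effect the abstract relies on before extracting pitch features.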

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Chun Huang ◽  
Diao Shen

The music performance system works by identifying the emotional elements of music to control the lighting changes. However, if a recognition error occurs, a good stage effect cannot be created. Therefore, this paper proposes an intelligent music emotion recognition and classification algorithm for the music performance system. The first part of the algorithm analyzes the emotional features of music, including acoustic features, melody features, and audio features. The three kinds of features are then combined to form a feature vector set. The latter part of the algorithm divides the feature vector set into training samples and test samples. The training samples are used to train a recognition and classification model based on a neural network. The test samples are then input into the trained model to realize the intelligent recognition and classification of music emotion. The results show that the kappa coefficient k values calculated for the proposed algorithm are greater than 0.75, which indicates that the recognition and classification results are consistent with the actual results and that the accuracy of recognition and classification is high. The research purpose is thus achieved.
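The kappa coefficient used as the evaluation criterion can be computed from actual and predicted labels as sketched below. This is the standard Cohen's kappa formula, k = (p_o - p_e) / (1 - p_e); the function name `cohen_kappa` and the emotion labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa between two equal-length label sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(actual)
    # Observed agreement p_o: fraction of samples with matching labels.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Chance agreement p_e: sum over classes of the product of marginals.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single class everywhere); treated as perfect
        # agreement here by assumption, since kappa is undefined.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: ten clips, one misclassified.
actual = ["happy", "happy", "sad", "sad", "calm",
          "calm", "happy", "sad", "calm", "calm"]
predicted = ["happy", "happy", "sad", "calm", "calm",
             "calm", "happy", "sad", "calm", "calm"]
kappa = cohen_kappa(actual, predicted)
print(round(kappa, 3))
```

In this toy case the observed agreement is 0.9 and the chance agreement is 0.35, so kappa is about 0.846, which would exceed the 0.75 level the abstract reports.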


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1579 ◽  
Author(s):  
Kyoung Ju Noh ◽  
Chi Yoon Jeong ◽  
Jiyoun Lim ◽  
Seungeun Chung ◽  
Gague Kim ◽  
...  

Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To distribute SER models to real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labels and the weak generalization of the SER model for an unseen target domain. This study proposes a multi-path and group-loss-based network (MPGLN) for SER to support multi-domain adaptation. The proposed model includes a bidirectional long short-term memory-based temporal feature generator and a transferred feature extractor from the pre-trained VGG-like audio classification model (VGGish), and it learns simultaneously based on multiple losses according to the association of emotion labels in the discrete and dimensional models. For the evaluation of the MPGLN SER as applied to multi-cultural domain datasets, the Korean Emotional Speech Database (KESD), including KESDy18 and KESDy19, is constructed, and the English-speaking Interactive Emotional Dyadic Motion Capture database (IEMOCAP) is used. The evaluation of multi-domain adaptation and domain generalization showed 3.7% and 3.5% improvements, respectively, of the F1 score when comparing the performance of MPGLN SER with a baseline SER model that uses a temporal feature generator. We show that the MPGLN SER efficiently supports multi-domain adaptation and reinforces model generalization.


2021 ◽  
Vol 13 (14) ◽  
pp. 2697
Author(s):  
Bo Liu ◽  
Qi Xiao ◽  
Yuhao Zhang ◽  
Wei Ni ◽  
Zhen Yang ◽  
...  

To address the problem of intelligent recognition of optical ship targets under low-altitude squint detection, we propose an intelligent recognition method based on simulation samples. This method comprehensively considers geometric and spectral characteristics of ship targets and ocean background and performs full link modeling combined with the squint detection atmospheric transmission model. It also generates and expands squint multi-angle imaging simulation samples of ship targets in the visible light band using the expanded sample type to perform feature analysis and modification on SqueezeNet. Shallow and deeper features are combined to improve the accuracy of feature recognition. The experimental results demonstrate that using simulation samples to expand the training set can improve the performance of the traditional k-nearest neighbors algorithm and modified SqueezeNet. For the classification of specific ship target types, a mixed-scene dataset expanded with simulation samples was used for training. The classification accuracy of the modified SqueezeNet was 91.85%. These results verify the effectiveness of the proposed method.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Ahmet Mert ◽  
Hasan Huseyin Celik

The feasibility of using time–frequency (TF) ridge estimation on multi-channel electroencephalogram (EEG) signals for emotion recognition is investigated. Informative component extraction with low computational cost is examined using multivariate ridge estimation, without decreasing the accuracy of valence/arousal recognition. The advanced TF representation technique called the multivariate synchrosqueezing transform (MSST) is used to obtain well-localized components of multi-channel EEG signals. Maximum-energy components in the 2D TF distribution are determined using TF-ridge estimation to extract the instantaneous frequency and instantaneous amplitude. The statistical values of the estimated ridges are used as a feature vector for the inputs of machine learning algorithms. Thus, component information in multi-channel EEG signals can be captured and compressed into a low-dimensional space for emotion recognition. Mean and variance values of the five maximum-energy ridges in the MSST-based TF distribution are adopted as the feature vector. Properties of the five TF ridges in the frequency and energy planes (e.g., mean frequency, frequency deviation, mean energy, and energy deviation over time) are computed to obtain a 20-dimensional feature space. The proposed method is evaluated on the DEAP emotional EEG recordings for benchmarking, and recognition rates of up to 71.55% and 70.02% are obtained for high/low arousal and high/low valence, respectively.
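The 20-dimensional feature space follows from four statistics for each of five ridges. A minimal sketch of that arithmetic is given below, assuming each ridge is already available as a pair of instantaneous-frequency and energy series; the MSST and ridge estimation themselves are not implemented here. The function name `ridge_features`, the toy data, and the use of the population standard deviation as the "deviation" are assumptions for illustration.

```python
from statistics import fmean, pstdev


def ridge_features(ridges):
    """Build a feature vector from (frequencies, energies) ridge pairs.

    For each ridge: mean frequency, frequency deviation, mean energy,
    energy deviation. The deviation is taken as the population standard
    deviation, which is an assumption of this sketch.
    """
    features = []
    for freqs, energies in ridges:
        features.extend([
            fmean(freqs),
            pstdev(freqs),
            fmean(energies),
            pstdev(energies),
        ])
    return features


# Hypothetical toy ridges: five ridges, three time samples each.
ridges = [([8.0 + k, 8.5 + k, 9.0 + k], [1.0, 0.9, 1.1]) for k in range(5)]
vector = ridge_features(ridges)
print(len(vector))
```

Five ridges times four statistics gives the 20 features; a classifier would then receive `vector` as its input.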


2021 ◽  
Vol 9 (1) ◽  
pp. 28-35
Author(s):  
Mariya Podshivalova ◽  
S. Almrshed

The starting point of research on assessing the innovative capacity of an enterprise is the question of definitions. The authors therefore first review the scientific literature on the variety of definitions of the term "enterprise innovative capacity". The review shows that the wording of this term differs significantly among both foreign and Russian researchers. The authors propose a systematization of approaches to the definition and a corresponding graphical classification model, which highlights the evolutionary, resource, functional, and process approaches. A critical analysis of approaches to assessing enterprise innovative capacity is then carried out. In the first stage, the content of modern assessment methods was studied; in the second stage, the mathematical tools used were studied. The authors present the results of the critical analysis graphically and, based on them, conclude that among the approaches to assessing enterprise innovative capacity the evolutionary approach should be recognized as promising, and among the methods of quantitative assessment, the tools of economic statistics.

