Individual Violin Recognition Method Combining Tonal and Nontonal Features

Electronics, 2020, Vol 9 (6), pp. 950
Author(s): Qi Wang, Changchun Bao

Individual recognition among instruments of the same type is a challenging problem that has rarely been investigated. In this study, the individual recognition of violins is explored. Based on the source–filter model, the spectrum can be divided into tonal content and nontonal content, which reflect the timbre from complementary aspects. Tonal and nontonal gammatone frequency cepstral coefficients (GFCC) are combined to describe the corresponding spectrum contents in this study. In the recognition system, a Gaussian mixture model–universal background model (GMM–UBM) is employed to parameterize the distribution of the combined features. In order to evaluate the individual violin recognition task, a solo dataset including 86 violins is developed in this study. Compared with other features, the combined features show better performance in both individual violin recognition and violin grade classification. Experimental results also show that the GMM–UBM outperforms a convolutional neural network (CNN), especially when the training data are limited. Finally, the effect of players on individual violin recognition is investigated.
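
Below is a minimal sketch of the GMM–UBM back end described above, assuming the tonal/nontonal GFCC feature vectors have already been extracted and stacked frame by frame; scikit-learn's GaussianMixture is used for the UBM, and a mean-only MAP adaptation is shown. Function and variable names (`train_ubm`, `map_adapt_means`, `violin_features`) are illustrative, not from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(ubm_features, n_components=64):
    """Fit a universal background model on pooled features from many violins."""
    ubm = GaussianMixture(n_components=n_components, covariance_type='diag',
                          max_iter=200, random_state=0)
    ubm.fit(ubm_features)
    return ubm

def map_adapt_means(ubm, violin_features, relevance=16.0):
    """Mean-only MAP adaptation of the UBM to one violin's enrollment data."""
    resp = ubm.predict_proba(violin_features)            # (frames, components)
    n_k = resp.sum(axis=0)                               # soft counts per component
    e_k = resp.T @ violin_features / np.maximum(n_k[:, None], 1e-10)
    alpha = (n_k / (n_k + relevance))[:, None]           # adaptation coefficients
    adapted = GaussianMixture(n_components=ubm.n_components, covariance_type='diag')
    adapted.weights_, adapted.covariances_ = ubm.weights_, ubm.covariances_
    adapted.means_ = alpha * e_k + (1.0 - alpha) * ubm.means_
    adapted.precisions_cholesky_ = ubm.precisions_cholesky_
    return adapted

def llr_score(adapted, ubm, test_features):
    """Log-likelihood ratio of the adapted violin model against the UBM."""
    return adapted.score(test_features) - ubm.score(test_features)
```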

Informatics, 2018, Vol 5 (3), pp. 38
Author(s): Martin Jänicke, Bernhard Sick, Sven Tomforde

Personal wearables such as smartphones or smartwatches are increasingly used in everyday life. Frequently, activity recognition is performed on these devices to estimate the current user status and trigger automated actions according to the user's needs. In this article, we focus on the creation of a self-adaptive activity recognition system based on inertial measurement units (IMUs) that can include new sensors during runtime. Starting with a classifier based on Gaussian mixture models (GMMs), the density model is adapted to new sensor data fully autonomously by exploiting the marginalization property of normal distributions. To create a classifier from the adapted model, label inference is performed, based either on the initial classifier or on the training data. For evaluation, we used more than 10 h of annotated activity data from the publicly available PAMAP2 benchmark dataset. Using these data, we showed the feasibility of our approach and performed 9720 experiments to obtain reliable numbers. One approach performed reasonably well, leading to a system improvement on average, with an increase in the F-score of 0.0053, while the other showed clear drawbacks due to a high loss of information during label inference. Furthermore, a comparison with state-of-the-art techniques shows the need for further experiments in this area.
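
As a small illustration of the marginalization property mentioned above, the sketch below reduces a GMM defined over a joint sensor space to the dimensions of a smaller sensor set by selecting the corresponding sub-vectors and sub-matrices of each component; all names and shapes are assumptions, not the paper's implementation.

```python
import numpy as np

def marginalize_gmm(weights, means, covariances, keep_dims):
    """Marginalize a full-covariance GMM onto the dimensions in `keep_dims`."""
    keep = np.asarray(keep_dims)
    marg_means = means[:, keep]                    # (K, d_keep)
    marg_covs = covariances[:, keep][:, :, keep]   # (K, d_keep, d_keep)
    return weights, marg_means, marg_covs

# Example: a 2-component GMM over 6 IMU channels reduced to the first 3 channels.
weights = np.array([0.4, 0.6])
means = np.random.randn(2, 6)
covs = np.stack([np.eye(6), 2 * np.eye(6)])
w, m, c = marginalize_gmm(weights, means, covs, keep_dims=[0, 1, 2])
```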


Author(s): Debashish Dev Mishra, Utpal Bhattacharjee, Shikhar Kumar Sarma

The performance of an automatic speaker recognition (ASR) system degrades drastically in the presence of noise and other distortions, especially when there is a noise-level mismatch between the training and testing environments. This paper explores the problem of speaker recognition in noisy conditions, assuming that the speech signals are corrupted by noise. A major problem of most speaker recognition systems is their unsatisfactory performance in noisy environments. In this experimental study, we investigated a combination of Mel frequency cepstral coefficients (MFCC) for feature extraction and cepstral mean normalization (CMN) for speech enhancement. Our system uses a Gaussian mixture model (GMM) classifier and is implemented in the MATLAB® 7 programming environment. The process involves the use of speaker data for both training and testing. The test data are matched against speaker models trained on the training data using GMM modeling. Finally, experiments are carried out to test the new model for ASR given limited training data and differing levels and types of realistic background noise. The results demonstrate the robustness of the new system.
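
The following Python sketch mirrors the MFCC + CMN + GMM pipeline described above (the original system was implemented in MATLAB); the file layout, sampling rate and mixture order are assumptions.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_cmn(wav_path, n_mfcc=13):
    """Extract MFCCs and apply cepstral mean normalization per utterance."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # (frames, n_mfcc)
    return mfcc - mfcc.mean(axis=0, keepdims=True)             # CMN

def train_speaker_models(train_files_by_speaker, n_components=16):
    """Fit one GMM per speaker on that speaker's pooled, normalized MFCCs."""
    models = {}
    for speaker, files in train_files_by_speaker.items():
        feats = np.vstack([mfcc_cmn(f) for f in files])
        models[speaker] = GaussianMixture(n_components, covariance_type='diag',
                                          max_iter=200).fit(feats)
    return models

def identify(models, test_wav):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    feats = mfcc_cmn(test_wav)
    return max(models, key=lambda s: models[s].score(feats))
```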


Identifying a person's voice among different voices is known as speaker recognition. The speech signals of individuals are selected by means of speaker recognition or identification. In this work, an efficient method for speaker recognition using discrete wavelet transform (DWT) features and Gaussian mixture models (GMM) for classification is presented. The input speech signal is decomposed by the DWT into subband coefficients, which serve as the input features for classification. Classification is performed by a GMM classifier with 4, 8, 16 and 32 Gaussian components. Results show an improved accuracy of 96.18% on speaker signals using DWT features and the GMM classifier.
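
A possible realization of the DWT front end described above is sketched below using PyWavelets; the wavelet family, decomposition level and subband statistics are assumptions, since the abstract does not specify them.

```python
import numpy as np
import pywt
from sklearn.mixture import GaussianMixture

def dwt_features(frame, wavelet='db4', level=4):
    """Decompose a speech frame into DWT subbands and summarize each subband."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    # Energy and standard deviation of each subband as a compact feature vector.
    return np.array([stat for c in coeffs for stat in (np.sum(c ** 2), np.std(c))])

def fit_speaker_gmm(frames, n_components=8):
    """One GMM per speaker; n_components can be 4, 8, 16 or 32 as in the paper."""
    feats = np.vstack([dwt_features(f) for f in frames])
    return GaussianMixture(n_components=n_components, covariance_type='diag').fit(feats)
```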


This paper proposes a novel approach that combines the power of generative Gaussian mixture models (GMM) and discriminative support vector machines (SVM). The main objective of this paper is to incorporate GMM supervectors into an SVM classifier for the language identification (LID) task. A GMM-based LID system requires a large amount of training data to capture all the variations present in the phonotactic constraints imposed by a language, whereas Gaussian mixture model–universal background model (GMM–UBM) modeling requires less training data. In the GMM–UBM LID system, a language model is created by maximum a posteriori (MAP) adaptation of the means of the universal background model (UBM). Here, the GMM supervectors are created by concatenating the means of the adapted mixture components from the UBM. These supervectors are then fed to an SVM for classification. In this paper, the performance of the SVM-based GMM–UBM LID system is compared with that of the conventional GMM LID system. From the performance analysis, it is found that the SVM-based GMM–UBM LID system performs well compared with the GMM-based LID system.
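
The sketch below illustrates the supervector construction described above: the UBM means are MAP-adapted to an utterance, concatenated into a supervector, and passed to a linear SVM. It assumes a UBM already fitted with scikit-learn; the relevance factor and kernel choice are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def supervector(ubm: GaussianMixture, feats, relevance=16.0):
    """Mean-only MAP adaptation of the UBM, flattened into a supervector."""
    resp = ubm.predict_proba(feats)                     # (frames, components)
    n_k = resp.sum(axis=0)                              # soft counts per component
    e_k = resp.T @ feats / np.maximum(n_k[:, None], 1e-10)
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted_means = alpha * e_k + (1.0 - alpha) * ubm.means_
    return adapted_means.ravel()                        # concatenated adapted means

def train_lid_svm(ubm, utterances, labels):
    """One supervector per training utterance, classified by a linear SVM."""
    X = np.vstack([supervector(ubm, u) for u in utterances])
    return SVC(kernel='linear').fit(X, labels)
```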


Electronics, 2020, Vol 9 (1), pp. 107
Author(s): Heoncheol Lee

Multi-robot systems require collective map information on surrounding environments to efficiently cooperate with one another on assigned tasks. This paper addresses the problem of grid map merging to obtain the collective map information in multi-robot systems with unknown initial poses. If inter-robot measurements are not available, the only way to merge the maps is to find and match the overlapping area between maps. This paper proposes a tomographic feature-based map merging method, which can be successfully conducted with relatively small overlapping areas. The first part of the proposed method estimates a map transformation matrix using the Radon transform, which can extract tomographically salient features from individual grid maps. The second part determines the search space using Gaussian mixture models based on the estimated map transformation matrix. The final part optimizes an objective function modeled from tomographic information within the determined search space. Evaluation results with various pairs of individual maps produced by simulations and experiments showed that the proposed method can merge the individual maps more accurately than other map merging methods.
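
As a rough illustration of the first part of the method, the sketch below estimates the relative rotation between two occupancy grid maps by correlating the angular energy profiles of their Radon transforms (computed with scikit-image). The search-space construction with Gaussian mixture models and the final optimization are not reproduced here.

```python
import numpy as np
from skimage.transform import radon

def angular_profile(grid_map, angles=np.arange(180)):
    """Total projection energy per angle of the Radon transform (sinogram)."""
    sinogram = radon(grid_map.astype(float), theta=angles, circle=False)
    return (sinogram ** 2).sum(axis=0)        # one value per projection angle

def estimate_rotation(map_a, map_b):
    """Rotation (degrees, modulo 180) that best aligns the two angular profiles."""
    pa, pb = angular_profile(map_a), angular_profile(map_b)
    correlations = [np.dot(pa, np.roll(pb, s)) for s in range(len(pa))]
    return int(np.argmax(correlations))       # circular shift with max correlation
```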


2021, Vol 17 (2), pp. 155014772199262
Author(s): Shiwen Chen, Junjian Yuan, Xiaopeng Xing, Xin Qin

To address the shortcomings of existing research on individual emitter identification technology, which relies primarily on theoretical simulation and lacks verification equipment for external field measurements, an individual emitter identification system based on Automatic Dependent Surveillance–Broadcast (ADS-B) is designed. On the one hand, the system extracts individual features from the signal preamble. On the other hand, it decodes the transmitter's individual identity information and generates an individual recognition training data set, on which the recognition network can be trained to achieve individual signal recognition. For the collected signals, six parameters were extracted as individual features. To reduce the feature dimensions, a Bézier curve fitting method is used for four of the features, and the spatial distribution of the fitted Bézier curve control points is taken as an individual feature. The processed features are classified with multiple classifiers, and the classification results are fused using improved Dempster–Shafer evidence theory. Field measurements show that the average individual recognition accuracy of the system reaches 88.3%, which essentially meets the requirements.
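
The decision-fusion step can be illustrated with the classical Dempster rule of combination over singleton emitter classes, as sketched below; the improved Dempster–Shafer variant used in the paper is not reproduced, and the mass values shown are made up for the example.

```python
import numpy as np

def dempster_combine(m1, m2):
    """Combine two mass vectors assigned to the same singleton emitter classes."""
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    joint = np.outer(m1, m2)
    conflict = 1.0 - np.diag(joint).sum()     # mass assigned to conflicting pairs
    if np.isclose(conflict, 1.0):
        raise ValueError("Total conflict: masses cannot be combined.")
    return np.diag(joint) / (1.0 - conflict)  # renormalize by (1 - conflict)

# Example: two classifiers' masses over three emitter individuals.
fused = dempster_combine([0.6, 0.3, 0.1], [0.5, 0.4, 0.1])
```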


Author(s): Pavitra Patel, A. A. Chaudhari, M. A. Pund, D. H. Deshmukh

Speech emotion recognition is an important issue that affects human–machine interaction. Automatic recognition of human emotion in speech aims at recognizing the underlying emotional state of a speaker from the speech signal. Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e., the Bayesian optimal classifier) are popular and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation-maximization (EM) algorithm on a training data set. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is named the Boosted-GMM algorithm. Our speech emotion recognition experiments show that the emotion recognition rates are effectively and significantly boosted by the Boosted-GMM algorithm compared to the EM-GMM algorithm.

During human–machine interaction, human beings have feelings that they want to convey to their communication partner, who may be either a human or a machine. This work concerns the recognition of human emotion from the speech signal.

Emotion recognition from a speaker's speech is very difficult for the following reasons. Acoustic variability is introduced by the existence of different sentences, speakers, speaking styles and speaking rates. The same utterance may express different emotions, so it is very difficult to differentiate the corresponding portions of the utterance. Another problem is that emotional expression depends on the speaker and his or her culture and environment. As culture and environment change, speaking style also changes, which is another challenge for a speech emotion recognition system.
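
A simplified sketch of boosting class-conditional GMMs is given below. Because scikit-learn's GaussianMixture does not accept sample weights, each round fits the GMMs on a weighted resample of the training frames and uses a SAMME-style round weight; this is a stand-in for the Boosted-GMM algorithm described above, not a reproduction of it.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def boosted_gmm(X, y, n_rounds=5, n_components=8, seed=0):
    """Boost class-conditional GMMs by reweighting and resampling the frames."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    w = np.full(len(X), 1.0 / len(X))                 # frame weights
    rounds = []                                       # (per-class GMMs, round weight)
    for _ in range(n_rounds):
        idx = rng.choice(len(X), size=len(X), p=w / w.sum())
        Xr, yr = X[idx], y[idx]
        gmms = {c: GaussianMixture(n_components, covariance_type='diag',
                                   reg_covar=1e-3, random_state=seed).fit(Xr[yr == c])
                for c in classes}
        pred = np.array([max(classes, key=lambda c: gmms[c].score(x[None, :]))
                         for x in X])
        err = np.clip(np.average(pred != y, weights=w), 1e-10, 1 - 1e-10)
        alpha = np.log((1 - err) / err) + np.log(len(classes) - 1)  # SAMME weight
        w *= np.exp(alpha * (pred != y))              # up-weight misclassified frames
        rounds.append((gmms, alpha))
    return classes, rounds

def boosted_predict(classes, rounds, x):
    """Weighted vote of the per-round class-conditional GMM decisions."""
    votes = {c: 0.0 for c in classes}
    for gmms, alpha in rounds:
        votes[max(classes, key=lambda c: gmms[c].score(x[None, :]))] += alpha
    return max(votes, key=votes.get)
```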


Diagnostics, 2021, Vol 11 (11), pp. 2063
Author(s): Isaac Barroso, João Tiago Guimarães, Milton Severo, Vanda Craveiro, Elisabete Ramos

Background: The immune system gradually matures early in life in the face of internal and external stimuli. Whether the immune responses are lasting and stable during the course of life is still unclear. Methods: As part of the EPITeen cohort, 1183 adolescents were prospectively evaluated at the ages of 13, 17, 21, 24 and 27. Sociodemographic, behavioral and clinical data were collected by self- and face-to-face-administered questionnaires, along with a physical examination comprising anthropometric measurements and blood sample collection. Mixed-effects models were used to identify individual trajectories of white blood cells (WBC), and finite Gaussian mixture models were used to identify clusters of individual trajectories. Results: Participants were allocated into six clusters based on the individual trajectories of WBC distribution. The Higher Inflammatory Activation Cluster (11.4%) had the highest total WBC count and neutrophil percentage, as well as the lowest percentage of lymphocytes; these participants had significantly higher odds of being overweight [OR = 2.44, 95%CI: 1.51–3.92]. The Lowest Levels of WBC Cluster (24.1%) had the lowest total WBC count and was characterized by higher participation in sports [OR = 1.54, 95%CI: 1.12–2.13]. The Highest Proportion of Eosinophils Cluster (20.1%) had the highest eosinophil percentage and the highest likelihood of having been diagnosed with a chronic disease [OR = 2.11, 95%CI: 1.43–3.13], namely "asthma or allergies" [OR = 14.0, 95%CI: 1.73–112.2]. The Lowest Proportion of Eosinophils Cluster (29.1%) had the lowest percentage of eosinophils and basophils, as well as the highest lymphocyte proportion. Participants in the Undefined Cluster (13.8%) showed the highest percentage of monocytes and basophils and were also characterized by significantly lower odds of having parents with 7–9 years of schooling [OR = 0.56, 95%CI: 0.32–0.99]. Conclusions: In this study we identified distinct immunological trajectories of WBC from adolescence to adulthood that were associated with social, clinical and behavioral determinants. These results suggest that these immunological trajectories are defined early in life and are dependent on the exposures.
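
The trajectory-clustering idea can be sketched as follows: fit a simple line to each participant's WBC values over age and cluster the resulting (intercept, slope) pairs with a finite Gaussian mixture. The paper fits mixed-effects models rather than per-participant least squares, and the column names ('id', 'age', 'wbc') are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

def trajectory_clusters(df: pd.DataFrame, n_clusters=6):
    """df columns (assumed): 'id', 'age', 'wbc' with repeated measures per id."""
    params = []
    for _, g in df.groupby('id'):
        slope, intercept = np.polyfit(g['age'], g['wbc'], deg=1)   # per-person line
        params.append((intercept, slope))
    params = np.array(params)
    gmm = GaussianMixture(n_components=n_clusters, random_state=0).fit(params)
    return gmm.predict(params), params            # cluster label per participant
```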


PeerJ, 2020, Vol 8, pp. e8693
Author(s): Sivaramakrishnan Rajaraman, Incheol Kim, Sameer K. Antani

Convolutional neural networks (CNNs) trained on natural images are extremely successful in image classification and localization due to their superior automated feature extraction capability. In extending their use to biomedical recognition tasks, it is important to note that the visual features of medical images tend to be very different from those of natural images. There are advantages to training these networks on large-scale collections of medical images of a common modality pertaining to the recognition task. Further, improved generalization in transferring knowledge across similar tasks is possible when the models are trained to learn modality-specific features and then suitably repurposed for the target task. In this study, we propose modality-specific ensemble learning toward improving abnormality detection in chest X-rays (CXRs). CNN models are trained on a large-scale CXR collection to learn modality-specific features and then repurposed for detecting and localizing abnormalities. Model predictions are combined using different ensemble strategies toward reducing prediction variance and sensitivity to the training data while improving overall performance and generalization. Class-selective relevance mapping (CRM) is used to visualize the learned behavior of the individual models and their ensembles. It localizes discriminative regions of interest (ROIs) showing abnormal regions and offers an improved explanation of model predictions. It was observed that the model ensembles demonstrate superior localization performance in terms of Intersection over Union (IoU) and mean average precision (mAP) metrics compared with any individual constituent model.
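
A minimal sketch of the ensembling and localization evaluation steps is given below: per-model class probabilities are averaged, and the IoU of a predicted bounding box against a ground-truth box is computed; box formats and model outputs are assumptions.

```python
import numpy as np

def average_ensemble(prob_list):
    """Simple (unweighted) averaging of per-model class probabilities."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)   # (n_samples, n_classes)

def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```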

