Statistical and Visual Analysis of Audio, Text, and Image Features for Multi-Modal Music Genre Recognition

Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1502
Author(s):  
Ben Wilkes ◽  
Igor Vatolkin ◽  
Heinrich Müller

We present a multi-modal genre recognition framework that considers the modalities audio, text, and image by features extracted from audio signals, album cover images, and lyrics of music tracks. In contrast to pure learning of features by a neural network as done in the related work, handcrafted features designed for a respective modality are also integrated, allowing for higher interpretability of created models and further theoretical analysis of the impact of individual features on genre prediction. Genre recognition is performed by binary classification of a music track with respect to each genre based on combinations of elementary features. For feature combination a two-level technique is used, which combines aggregation into fixed-length feature vectors with confidence-based fusion of classification results. Extensive experiments have been conducted for three classifier models (Naïve Bayes, Support Vector Machine, and Random Forest) and numerous feature combinations. The results are presented visually, with data reduction for improved perceptibility achieved by multi-objective analysis and restriction to non-dominated data. Feature- and classifier-related hypotheses are formulated based on the data, and their statistical significance is formally analyzed. The statistical analysis shows that the combination of two modalities almost always leads to a significant increase of performance and the combination of three modalities in several cases.
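The second level of the feature combination, confidence-based fusion of classification results, can be sketched as follows. The abstract does not state the fusion rule, so this is only an illustration under an assumption: each modality's binary classifier outputs a probability for the genre, and the modality whose probability lies farthest from 0.5 decides. The names `fuse_by_confidence`, `predict_genre`, and the example scores are hypothetical.

```python
from typing import Dict


def fuse_by_confidence(probabilities: Dict[str, float]) -> float:
    """Return the genre probability of the most confident modality.

    Confidence is taken here as the distance of a probability from 0.5.
    This is an illustrative assumption, not the rule used in the paper.
    """
    if not probabilities:
        raise ValueError("at least one modality is required")
    most_confident = max(probabilities, key=lambda m: abs(probabilities[m] - 0.5))
    return probabilities[most_confident]


def predict_genre(probabilities: Dict[str, float], threshold: float = 0.5) -> bool:
    """Binary decision for one genre from the fused probability."""
    return fuse_by_confidence(probabilities) >= threshold


# Hypothetical per-modality outputs for one track and one genre.
track_scores = {"audio": 0.62, "text": 0.15, "image": 0.55}
fused = fuse_by_confidence(track_scores)
is_genre = predict_genre(track_scores)
```

In this example the text classifier is the most confident one, so its low probability overrides the weakly positive audio and image outputs.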

Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7417
Author(s):  
Alex J. Hope ◽  
Utkarsh Vashisth ◽  
Matthew J. Parker ◽  
Andreas B. Ralston ◽  
Joshua M. Roper ◽  
...  

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.


2021 ◽  
Author(s):  
Hanna Klimczak ◽  
Wojciech Kotłowski ◽  
Dagmara Oszkiewicz ◽  
Francesca DeMeo ◽  
Agnieszka Kryszczyńska ◽  
...  

The aim of the project is the classification of asteroids according to the most commonly used asteroid taxonomy (Bus-DeMeo et al. 2009) with the use of various machine learning methods such as Logistic Regression, Naive Bayes, Support Vector Machines, Gradient Boosting and Multilayer Perceptrons. Different parameter sets are used for classification in order to compare the quality of prediction with a limited amount of data, namely the difference in performance between using the 0.45 μm to 2.45 μm spectral range and multiple spectral features, as well as performing Principal Component Analysis to reduce the dimensions of the spectral data.

This work has been supported by grant No. 2017/25/B/ST9/00740 from the National Science Centre, Poland.


2020 ◽  
Vol 65 (1) ◽  
pp. 33-50 ◽  
Author(s):  
Chahira Mahjoub ◽  
Régine Le Bouquin Jeannès ◽  
Tarek Lajnef ◽  
Abdennaceur Kachouri

Electroencephalography (EEG) is a common tool used for the detection of epileptic seizures. However, the visual analysis of long-term EEG recordings is subjective, time-consuming and error-prone. Various epileptic seizure detection algorithms have been proposed to deal with such issues. In this study, a novel automatic seizure-detection approach is proposed. Three different strategies are offered to the user, who can choose the appropriate one for a given classification problem. The feature extraction step, including both linear and nonlinear measures, is performed either directly on the EEG signals, or on the derived sub-bands of the tunable-Q wavelet transform (TQWT), or on the intrinsic mode functions (IMFs) of multivariate empirical mode decomposition (MEMD). The classification procedure is executed using a support vector machine (SVM). The performance of the proposed method is evaluated on a publicly available database from which six binary classification cases are formulated to discriminate between healthy, seizure and non-seizure EEG signals. Our results show high performance in terms of accuracy (ACC), sensitivity (SEN) and specificity (SPE) compared to the state-of-the-art approaches. Thus, the proposed approach for automatic seizure detection can be considered a valuable alternative to existing methods, able to alleviate the burden of visual analysis and accelerate seizure detection.


2018 ◽  
Vol 8 (12) ◽  
pp. 2649 ◽  
Author(s):  
Balakrishnan Ramalingam ◽  
Anirudh Lakshmanan ◽  
Muhammad Ilyas ◽  
Anh Le ◽  
Mohan Elara

Debris detection and classification is an essential function for autonomous floor-cleaning robots. It enables floor-cleaning robots to identify and avoid hard-to-clean debris, specifically large liquid spillage debris. This paper proposes a debris-detection and classification scheme for an autonomous floor-cleaning robot using a deep Convolutional Neural Network (CNN) and Support Vector Machine (SVM) cascaded technique. The SSD (Single-Shot MultiBox Detector) MobileNet CNN architecture is used for classifying the solid and liquid spill debris on the floor from the captured image. Then, the SVM model is employed for binary classification of liquid spillage regions based on size, which helps floor-cleaning devices to identify the larger liquid spillage debris regions, considered as hard-to-clean debris in this work. The experimental results prove that the proposed technique can efficiently detect and classify the debris on the floor and achieves 95.5% classification accuracy. The cascaded approach takes approximately 71 milliseconds for the entire process of debris detection and classification, which implies that the proposed technique is suitable for deployment in real-time selective floor-cleaning applications.


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4629 ◽  
Author(s):  
Ciaran Cooney ◽  
Attila Korik ◽  
Raffaella Folli ◽  
Damien Coyle

Classification of electroencephalography (EEG) signals corresponding to imagined speech production is important for the development of a direct-speech brain–computer interface (DS-BCI). Deep learning (DL) has been utilized with great success across several domains. However, it remains an open question whether DL methods provide significant advances over traditional machine learning (ML) approaches for classification of imagined speech. Furthermore, hyperparameter (HP) optimization has been neglected in DL-EEG studies, so the significance of its effects remains uncertain. In this study, we aim to improve classification of imagined speech EEG by employing DL methods while also statistically evaluating the impact of HP optimization on classifier performance. We trained three distinct convolutional neural networks (CNN) on imagined speech EEG using a nested cross-validation approach to HP optimization. Each of the CNNs evaluated was designed specifically for EEG decoding. An imagined speech EEG dataset consisting of both words and vowels facilitated training on both sets independently. CNN results were compared with three benchmark ML methods: Support Vector Machine, Random Forest and regularized Linear Discriminant Analysis. Intra- and inter-subject methods of HP optimization were tested and the effects of HPs statistically analyzed. Accuracies obtained by the CNNs were significantly greater than the benchmark methods when trained on both datasets (words: 24.97%, p < 1 × 10⁻⁷, chance: 16.67%; vowels: 30.00%, p < 1 × 10⁻⁷, chance: 20%). The effects of varying HP values, and interactions between HPs and the CNNs were both statistically significant. The results of HP optimization demonstrate how critical it is for training CNNs to decode imagined speech.
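The nested cross-validation used for HP optimization can be illustrated with index splits alone. This is a minimal sketch, not the authors' procedure: the fold counts, the interleaved fold assignment, and the helper names `k_fold` and `nested_splits` are assumptions made for illustration. The point it shows is that the inner folds, used to choose HPs, are drawn only from the outer training portion, so the outer test fold never influences HP selection.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold(indices: List[int], k: int) -> Iterator[Split]:
    """Yield (train, held_out) index lists for k interleaved folds."""
    for fold in range(k):
        held_out = indices[fold::k]
        held_set = set(held_out)
        train = [i for i in indices if i not in held_set]
        yield train, held_out


def nested_splits(
    n_samples: int, outer_k: int, inner_k: int
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    The inner splits partition only the outer training indices, so
    hyperparameters are selected without touching the outer test fold.
    """
    for outer_train, outer_test in k_fold(list(range(n_samples)), outer_k):
        inner = list(k_fold(outer_train, inner_k))
        yield outer_train, outer_test, inner


# Hypothetical sizes: 12 trials, 4 outer folds, 3 inner folds.
splits = list(nested_splits(n_samples=12, outer_k=4, inner_k=3))
```

In practice each inner split would be used to score one HP setting, and the setting with the best mean inner score would be retrained on the full outer training set before evaluation on the outer test fold.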


Author(s):  
S. Boeke ◽  
M. J. C. van den Homberg ◽  
A. Teklesadik ◽  
J. L. D. Fabila ◽  
D. Riquet ◽  
...  

Reliable predictions of the impact of natural hazards turning into a disaster are important for better targeting humanitarian response as well as for triggering early action. Open data and machine learning can be used to predict loss and damage to the houses and livelihoods of affected people. This research focuses on agricultural loss, more specifically rice loss in the Philippines due to typhoons. Regression and binary classification algorithms are trained using feature selection methods to find the most important explanatory features. Both geographical data from every province and typhoon-specific features of 11 historical typhoons are used as input. The percentage of lost rice area is considered as the output, with an average value of 7.1%. As for the regression task, the support vector regressor performed best with a Mean Absolute Error of 6.83 percentage points. For the classification model, thresholds of 20%, 30% and 40% are tested in order to find the best performing model. These thresholds represent different levels of lost rice fields for triggering anticipatory action towards farmers. The binary classifiers are trained to increase their ability to correctly predict the positive samples. In all three cases, the support vector classifier performed the best with recall scores of 88%, 75% and 81.82%, respectively. However, the precision score for each of these models was low: 17.05%, 14.46% and 10.84%, respectively. For both the support vector regressor and classifier, of all 14 available input features, only wind speed was selected as an explanatory feature. Yet, for the other algorithms that were trained in this study, other sets of features were selected depending also on the hyperparameter settings. This variation in selected feature sets as well as the imprecise predictions were consequences of the small dataset that was used for this study. It is therefore important that data for more typhoons as well as data on other explanatory variables are gathered in order to make more robust and accurate predictions. Also, if loss data becomes available at municipality level rather than province level, the models will become more accurate and valuable for operationalization.
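The gap between the high recall and low precision reported above can be summarized with the F1 score, the harmonic mean of the two. The sketch below applies the standard formula to the reported precision and recall values, pairing them with the 20%, 30% and 40% thresholds in the order the abstract lists them. The resulting F1 values are derived here for illustration and are not reported by the authors.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Loss threshold (%) -> (precision, recall) as reported for the
# support vector classifier in the abstract.
reported: Dict[int, Tuple[float, float]] = {
    20: (0.1705, 0.88),
    30: (0.1446, 0.75),
    40: (0.1084, 0.8182),
}

f1_by_threshold = {t: f1_score(p, r) for t, (p, r) in reported.items()}
```

The derived F1 values come out at roughly 0.29, 0.24 and 0.19, which makes concrete how many false alarms accompany each correctly predicted loss event.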


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5896
Author(s):  
Eddi Miller ◽  
Vladyslav Borysenko ◽  
Moritz Heusinger ◽  
Niklas Niedner ◽  
Bastian Engelmann ◽  
...  

Changeover times are an important element when evaluating the Overall Equipment Effectiveness (OEE) of a production machine. The article presents a machine learning (ML) approach that is based on an external sensor setup to automatically detect changeovers in a shopfloor environment. The door statuses, coolant flow, power consumption, and operator indoor GPS data of a milling machine were used in the ML approach. As ML methods, Decision Trees, Support Vector Machines, (Balanced) Random Forest algorithms, and Neural Networks were chosen, and their performance was compared. The best results were achieved with the Random Forest ML model (97% F1 score, 99.72% AUC score). It was also found that model performance is optimal when only a binary classification of a changeover phase and a production phase is considered and fewer subphases of the changeover process are applied.


Mekatronika ◽  
2021 ◽  
Vol 3 (1) ◽  
pp. 1-9
Author(s):  
Wang Yan ◽  
HouJun Lu ◽  
Chun Sern Choong

It is difficult to spot failures in port machinery and equipment, and maintaining such systems is even more complex. Carrying out maintenance modifications in a reasonable time is a tough challenge, since each change might require an endless number of test cases to be run. It is critical to have a risk assessment of the impact of such maintenance fixes. In the software engineering community, there has been a considerable amount of study on failure prediction. Regrettably, there is little evidence of their application in day-to-day software development in port machinery and equipment. In this paper, we propose an unsupervised machine learning (k-means clustering) method for categorising cranes for maintenance and use a machine learning pipeline to solve the classification of crane failure data. The crane's maintenance decision data demonstrates the method's effectiveness. It was demonstrated that the Linear Support Vector Machine could give a superior classification accuracy for crane maintenance prediction, with 100 percent accuracy on the training set and 94.5 percent accuracy on the test set.


Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4322
Author(s):  
Hui Wei ◽  
Weiwei Shu ◽  
Longjun Dong ◽  
Zhongying Huang ◽  
Daoyuan Sun

The discrimination of micro-seismic events (events) and blasts is significant for monitoring and analyzing micro-seismicity in underground mines. To eliminate the negative effects of conventional discrimination methods, a waveform image discriminant method was proposed. Principal component analysis (PCA) was applied to extract the raw features of events and blasts from their waveform images, which were established from the recorded field data, and to transform them into new uncorrelated features. The amount of initial information retained in the derived features could be determined quantitatively by the contribution rate. The binary classification models were established by utilizing the support vector machine (SVM) algorithm and the PCA-derived waveform image features. Results of four groups of cross validation show that the optimal values for the accuracy of events and blasts, total accuracy, and quality evaluation parameter MCC are 97.1%, 93.8%, 93.60%, and 0.8723, respectively. Moreover, the computation efficiency per accuracy (CEA) was introduced to quantitatively evaluate the effects of contribution rate on classification accuracy and computation efficiency. The optimal contribution rate was determined to be 0.90. The waveform image discriminant method can automatically classify events and blasts in underground mines, ensuring the efficient establishment of high-quality micro-seismic databases and providing adequate data for the subsequent seismicity analysis.
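The contribution rate mentioned above is commonly understood as the cumulative share of variance retained by the leading principal components. The sketch below, assuming that reading, picks the smallest number of components whose cumulative share reaches a target such as 0.90. The function name `components_for_contribution` and the eigenvalues are hypothetical and are not taken from the paper.

```python
from typing import List


def components_for_contribution(eigenvalues: List[float], target: float) -> int:
    """Smallest number of leading components whose cumulative
    variance share (contribution rate) reaches the target."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    # Components are ranked by explained variance, largest first.
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(eigenvalues)


# Hypothetical covariance eigenvalues of waveform image features.
example_eigenvalues = [5.0, 3.0, 1.2, 0.5, 0.3]
n_components = components_for_contribution(example_eigenvalues, target=0.90)
```

With these example values the first two components retain 80% of the variance and the first three retain 92%, so three components are kept for a target of 0.90. A higher target keeps more components and raises the cost of the downstream SVM, which is the trade-off the CEA measure is meant to quantify.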

