Distance Metric Learnt Kernel-Based Music Classification Using Timbral Descriptors
This paper proposes automatic music genre classification based on distance metric learning (DML). Three types of timbral descriptors, namely mel-frequency cepstral coefficient (MFCC) features, modified group delay features (MODGDF), and low-level timbral feature sets, are combined at the feature level. We experimented with k-nearest neighbor (kNN) and support vector machine (SVM) classifiers using standard kernels and DML kernels (DMLK) on the GTZAN and Folk music datasets. With the best-performing standard kernel (RBF), the kernel kNN and SVM classifiers achieve classification accuracies of 79.03% and 90.16%, respectively, on the GTZAN dataset, and 86.60% and 92.26%, respectively, on the Folk music dataset. Replacing the standard kernels with DML kernels improves accuracy further: DMLK-kNN and DMLK-SVM reach 84.46% and 92.74%, respectively, on GTZAN, and 90.00% and 96.23%, respectively, on the Folk music dataset. The results demonstrate the potential of DML kernels for the music genre classification task.
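To make the pipeline concrete, the following minimal Python sketch illustrates feature-level fusion followed by kernel kNN with a metric-learnt kernel. It is an illustration under stated assumptions, not the method used in the paper: the abstract does not specify the DML algorithm or the exact kernel form, so the sketch assumes an RBF kernel built on a Mahalanobis distance with M = LᵀL, where the matrix L is fixed by hand rather than learnt. The function names (`fuse_features`, `dml_rbf_kernel`, `kernel_knn_predict`), the parameter `gamma`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def fuse_features(*descriptor_vectors):
    """Feature-level fusion: concatenate the descriptor vectors of one track."""
    fused = []
    for vec in descriptor_vectors:
        fused.extend(vec)
    return fused


def dml_rbf_kernel(x, y, L, gamma=1.0):
    """RBF kernel on a Mahalanobis distance with M = L^T L (assumed form)."""
    diff = [a - b for a, b in zip(x, y)]
    # Projecting the difference with L gives ||L diff||^2 = diff^T M diff.
    projected = [sum(l_ij * d_j for l_ij, d_j in zip(row, diff)) for row in L]
    sq_dist = sum(p * p for p in projected)
    return math.exp(-gamma * sq_dist)


def kernel_knn_predict(query, train_X, train_y, kernel, k=3):
    """Kernel kNN: majority vote among the k nearest points in kernel space."""
    k_qq = kernel(query, query)
    scored = []
    for x, label in zip(train_X, train_y):
        # Squared distance in the kernel-induced space:
        # k(q, q) - 2 k(q, x) + k(x, x)
        d2 = k_qq - 2.0 * kernel(query, x) + kernel(x, x)
        scored.append((d2, label))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): one value per descriptor type, standing in for
# MFCC, MODGDF, and low-level timbral features. The third value is noise.
train_X = [
    fuse_features([0.0], [0.1], [5.0]),
    fuse_features([0.2], [0.0], [-5.0]),
    fuse_features([1.0], [1.1], [5.0]),
    fuse_features([1.2], [0.9], [-5.0]),
]
train_y = ["rock", "rock", "jazz", "jazz"]

# Hand-picked L (an assumption, not a learnt metric): keeps the first two
# dimensions and discards the noisy third one.
L = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]


def toy_kernel(a, b):
    return dml_rbf_kernel(a, b, L, gamma=1.0)


query = fuse_features([0.9], [1.0], [0.0])
prediction = kernel_knn_predict(query, train_X, train_y, toy_kernel, k=3)
print(prediction)  # jazz
```

In practice L would be estimated from labelled training data by a DML algorithm, and the same kernel function could fill a precomputed Gram matrix for an SVM; both steps are left out of this sketch.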