Uniform Manifold Approximation and Projection for Clustering Taxa through Vocalizations in a Neotropical Passerine (Rough-Legged Tyrannulet, Phyllomyias burmeisteri)

Bird vocalizations are a fruitful source of information for species classification. However, currently used analyses are ineffective at determining the taxonomic status of some groups. To provide a clearer grouping of taxa for such bird species from the analysis of vocalizations, more sensitive techniques are required. In this study, we evaluated the sensitivity of the Uniform Manifold Approximation and Projection (UMAP) technique for grouping the vocalizations of individuals of the Rough-legged Tyrannulet Phyllomyias burmeisteri complex. Although some studies have suggested the existence of two taxonomic groups, the species has proved difficult to classify. UMAP separated the groups more clearly than a previously used dimensionality-reduction technique (principal component analysis) and effectively identified the two taxonomic groups. These results suggest that UMAP applied to vocalization data can be a useful complementary tool for analyzing taxonomically complex species, bringing behavioral traits such as acoustic communication into the analysis.

Principal Component Analysis

ACM Computing Surveys ◽

10.1145/3447755 ◽

2021 ◽

Vol 54 (4) ◽

pp. 1-34

Author(s):

Felipe L. Gewers ◽

Gustavo R. Ferreira ◽

Henrique F. De Arruda ◽

Filipi N. Silva ◽

Cesar H. Comin ◽

...

Keyword(s):

Principal Component Analysis ◽

Dimensionality Reduction ◽

Real World ◽

Principal Component ◽

Component Analysis ◽

Basic Principles ◽

Data Standardization ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques ◽

Real World Datasets

Principal component analysis (PCA) is often applied to analyze data in a wide range of areas. This work reports, in an accessible and integrated manner, several theoretical and practical aspects of PCA. The basic principles underlying PCA, data standardization, possible visualizations of the PCA results, and outlier detection are addressed in turn. Next, the potential of using PCA for dimensionality reduction is illustrated on several real-world datasets. Finally, we summarize PCA-related approaches and other dimensionality reduction techniques. All in all, the objective of this work is to assist researchers from a wide range of areas in using and interpreting PCA.
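
As a rough illustration of why the abstract pairs PCA with data standardization, the sketch below z-scores two features measured on very different scales and then extracts the first principal component by power iteration. It is a minimal standard-library sketch, not the survey's own code: the function names, the toy `data` values, and the choice of power iteration (instead of a full eigendecomposition) are all assumptions made for this example.

```python
import math
import random


def standardize(rows):
    """Rescale every column to zero mean and unit sample variance (z-scores).

    Assumes no column is constant, otherwise the division below fails.
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def covariance(centered_rows):
    """Sample covariance matrix of rows whose columns already have zero mean."""
    n, d = len(centered_rows), len(centered_rows[0])
    return [[sum(r[i] * r[j] for r in centered_rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def first_component(cov, iterations=200, seed=0):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    d = len(cov)
    rng = random.Random(seed)
    # A random start is very unlikely to be orthogonal to the leading eigenvector.
    v = [rng.uniform(0.1, 1.0) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigenvalue


# Hypothetical toy data: the second feature is about 1000 times larger than the first.
data = [[1, 1000], [2, 2100], [3, 2900], [4, 4200], [5, 4800]]

z = standardize(data)
cov = covariance(z)
direction, variance = first_component(cov)
explained = variance / sum(cov[i][i] for i in range(len(cov)))
print("first component:", direction)
print("explained variance ratio:", round(explained, 4))
```

Without the `standardize` step the large-valued second feature would dominate the covariance matrix, so the first component would mostly reflect the unit of measurement and not the shared structure of the two features.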

Performance Analysis of Dimensionality Reduction Techniques in the Context of Clustering

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2019.8.s3.2084 ◽

2019 ◽

Vol 8 (S3) ◽

pp. 66-71

Author(s):

T. Sudha ◽

P. Nagendra Kumar

Keyword(s):

Principal Component Analysis ◽

Dimensionality Reduction ◽

High Dimensional Data ◽

Principal Component ◽

Component Analysis ◽

High Dimensional ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques ◽

Low Dimensional ◽

Probabilistic Principal Component Analysis

Data mining is one of the major areas of research. Clustering is one of the main functionalities of data mining. High dimensionality is one of the main issues in clustering, and dimensionality reduction can be used as a solution to this problem. The present work makes a comparative study of dimensionality reduction techniques, namely t-distributed stochastic neighbour embedding and probabilistic principal component analysis, in the context of clustering. High dimensional data have been reduced to low dimensional data using these two techniques. Cluster analysis has been performed on the high dimensional data as well as on the low dimensional data sets obtained through t-distributed stochastic neighbour embedding and probabilistic principal component analysis, with a varying number of clusters. Mean squared error, time, and space have been considered as parameters for comparison. The results obtained show that the time taken to convert the high dimensional data into low dimensional data using probabilistic principal component analysis is higher than the time taken using t-distributed stochastic neighbour embedding. The space required by the data set reduced through probabilistic principal component analysis is less than the storage space required by the data set reduced through t-distributed stochastic neighbour embedding.

Dimensionality Reduction Techniques for Visualizing Morphometric Data: Comparing Principal Component Analysis to Nonlinear Methods

Evolutionary Biology ◽

10.1007/s11692-018-9464-9 ◽

2018 ◽

Vol 46 (1) ◽

pp. 106-121 ◽

Cited By ~ 4

Author(s):

Trina Y. Du

Keyword(s):

Principal Component Analysis ◽

Dimensionality Reduction ◽

Principal Component ◽

Component Analysis ◽

Morphometric Data ◽

Nonlinear Methods ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques

Evaluation of projections, obtained by dimensionality reduction techniques

Lietuvos matematikos rinkinys ◽

10.15388/lmr.b.2014.26 ◽

2014 ◽

Vol 55 ◽

Author(s):

Kotryna Paulauskienė ◽

Olga Kurasova

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Topology Preservation ◽

Data Set ◽

Evaluation Measures ◽

Reduction Techniques ◽

Reduction Problem ◽

Dimensionality Reduction Techniques ◽

As Stress ◽

Artificial Datasets

In this paper, projection evaluation measures such as the stress function, Spearman’s rho, Konig’s topology preservation, silhouette, and Renyi entropy have been analyzed. The principal component analysis (PCA) and part–linear multidimensional projection (PLMP) techniques are used to reduce the dimensionality of the initial data set. The experiments have been carried out with seven real and artificial datasets. The experimental investigation has shown that several quality evaluation measures have to be used when a dimensionality reduction problem is solved.
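
Two of the measures named above, the stress function and Spearman’s rho, can be sketched on pairwise distances as follows. This is an illustrative standard-library sketch only: the abstract does not give the paper's exact formulas, so the normalized stress used here is one common variant chosen as an assumption, and the function names and the toy points are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances for all unordered pairs, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def normalized_stress(original, projected):
    """Squared distance error relative to the squared original distances.

    Assumed variant: sum((d_high - d_low)^2) / sum(d_high^2); 0 means the
    projection preserves every pairwise distance exactly.
    """
    d_high = pairwise_distances(original)
    d_low = pairwise_distances(projected)
    error = sum((a - b) ** 2 for a, b in zip(d_high, d_low))
    return error / sum(a ** 2 for a in d_high)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (inputs must not be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical example: project 3-D points to 2-D by dropping the last coordinate.
points_3d = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (3, 3, 1)]
points_2d = [p[:2] for p in points_3d]

stress = normalized_stress(points_3d, points_2d)
rho = spearman_rho(pairwise_distances(points_3d), pairwise_distances(points_2d))
print("stress:", stress)
print("Spearman's rho of distances:", rho)
```

The two numbers answer different questions: stress penalizes any change in the distance values, while Spearman’s rho only checks whether the ordering of distances survives the projection, which is consistent with the abstract's conclusion that several measures should be used together.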

Dimensionality reduction for EEG-based sleep stage detection: comparison of autoencoders, principal component analysis and factor analysis

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2020-0139 ◽

2020 ◽

Vol 0 (0) ◽

Author(s):

Alexandra-Maria Tăuţan ◽

Alessandro C. Rossi ◽

Ruben de Francisco ◽

Bogdan Ionescu

Keyword(s):

Principal Component Analysis ◽

Factor Analysis ◽

Dimensionality Reduction ◽

Promising Result ◽

Experimental Testing ◽

Sleep Stage ◽

Principal Component ◽

Component Analysis ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques

Methods developed for automatic sleep stage detection make use of large amounts of data in the form of polysomnographic (PSG) recordings to build predictive models. In this study, we investigate the effect of several dimensionality reduction techniques, i.e., principal component analysis (PCA), factor analysis (FA), and autoencoders (AE), on common classifiers, e.g., random forests (RF), multilayer perceptron (MLP), and long short-term memory (LSTM) networks, for automated sleep stage detection. Experimental testing is carried out on the MGH Dataset provided in the “You Snooze, You Win: The PhysioNet/Computing in Cardiology Challenge 2018”. The signals used as input are the six available electroencephalographic (EEG) channels and combinations with the other PSG signals provided: electrocardiogram (ECG), electromyogram (EMG), and respiration-based signals (respiratory effort and airflow). We observe that a similar or improved accuracy is obtained in most cases with all of the dimensionality reduction techniques, which is a promising result, as it allows the computational load to be reduced while maintaining performance and in some cases also improves the accuracy of automated sleep stage detection. In our study, using autoencoders for dimensionality reduction maintains the performance of the model, while using PCA and FA improves the accuracy of the models in most cases.

Classification of Observations through Combination of the Dimension Reduction and the Cluster Analysis

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i8.13 ◽

2017 ◽

Vol 7 (8) ◽

pp. 30

Author(s):

Hyeuk Kim

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Cluster Analysis ◽

Unsupervised Learning ◽

Principal Component ◽

Component Analysis ◽

Baseball Players ◽

Partitioning Around Medoids ◽

Different Characteristics

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. We interpret the meaning of the clustering graphically through this procedure. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of each group.
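
The idea behind partitioning around medoids can be sketched as a greedy swap search: each cluster is represented by an actual observation (its medoid), and a medoid is exchanged for another observation whenever that lowers the total distance of all points to their nearest medoid. The code below is a simplified standard-library sketch under stated assumptions, not the paper's procedure: it uses a naive initialisation (the first `k` points) in place of the usual BUILD step, and the `players` values are invented two-feature records, not Korea Baseball League data.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Greedy swap-based k-medoids; returns (medoid indices, cluster labels)."""
    medoids = list(range(k))  # naive start, an assumption of this sketch
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                # Try replacing the i-th medoid with the candidate point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Label each point with the position of its nearest medoid.
    labels = [min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels


# Hypothetical standardized statistics for six players (two features each).
players = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

medoids, labels = pam(players, k=2)
print("medoid indices:", medoids)
print("labels:", labels)
```

Because every medoid is a real observation, each cluster can be summarized by a representative player, which is one of the practical advantages over k-means, whose centers are averages that need not correspond to any observation.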
