A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations

2021 ◽  
Author(s):  
Mara Thomas ◽  
Frants Jensen ◽  
Baptiste Averly ◽  
Vlad Demartsev ◽  
Marta B. Manser ◽  
...  

The manual detection, analysis, and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups, and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighborhood-based dimensionality reduction of spectrograms to produce a latent-space representation of calls stands out for its conceptual simplicity and effectiveness. Using a dataset of manually annotated meerkat (Suricata suricatta) vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyze strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabeled calls. All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.

Autoencoders (AEs) are a family of neural networks trained so that the target output is the same as the input. They work by compressing the input into a latent-space representation and then reconstructing the output from this representation. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”. In this paper, a denoising autoencoder is implemented using a novel approach on MNIST handwritten digits. The model is validated through its training and validation losses and by comparing the reconstructed test images with the original images. The proposed model is found to perform very well.
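The denoising idea described above can be illustrated with a minimal sketch. It assumes a single tanh hidden layer, plain gradient descent on the mean squared error, and small synthetic inputs instead of MNIST; the function names and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def train_denoising_autoencoder(clean, hidden=4, noise_std=0.3,
                                lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer denoising autoencoder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_features = clean.shape[1]
    w_enc = rng.normal(0.0, 0.1, (n_features, hidden))
    b_enc = np.zeros(hidden)
    w_dec = rng.normal(0.0, 0.1, (hidden, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Corrupt the input; the reconstruction target stays the clean data.
        noisy = clean + rng.normal(0.0, noise_std, clean.shape)
        code = np.tanh(noisy @ w_enc + b_enc)   # latent representation
        recon = code @ w_dec + b_dec            # reconstruction
        err = recon - clean
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error through both layers.
        grad_recon = 2.0 * err / err.size
        grad_w_dec = code.T @ grad_recon
        grad_b_dec = grad_recon.sum(axis=0)
        grad_pre = (grad_recon @ w_dec.T) * (1.0 - code ** 2)
        grad_w_enc = noisy.T @ grad_pre
        grad_b_enc = grad_pre.sum(axis=0)
        w_enc -= lr * grad_w_enc
        b_enc -= lr * grad_b_enc
        w_dec -= lr * grad_w_dec
        b_dec -= lr * grad_b_dec
    return (w_enc, b_enc, w_dec, b_dec), losses

def reconstruct(params, inputs):
    """Encode inputs into the latent space and decode them again."""
    w_enc, b_enc, w_dec, b_dec = params
    return np.tanh(inputs @ w_enc + b_enc) @ w_dec + b_dec
```

Training on data that lies near a low-dimensional subspace and then calling `reconstruct` on noisy copies shows the intended effect: the bottleneck keeps the shared structure and discards much of the added noise.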


2021 ◽  
Vol 13 (2) ◽  
pp. 51
Author(s):  
Lili Sun ◽  
Xueyan Liu ◽  
Min Zhao ◽  
Bo Yang

The variational graph autoencoder, which can encode the structural and attribute information of a graph into low-dimensional representations, has become a powerful method for studying graph-structured data. However, most existing methods based on the variational (graph) autoencoder assume that the prior of the latent variables is the standard normal distribution, which encourages all nodes to gather around 0. This prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge therefore becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we use a noninformative prior as the prior distribution of the latent variables. This prior allows the parameters of the posterior distribution to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.


Author(s):  
Alireza Vafaei Sadr ◽  
Bruce A. Bassett ◽  
M. Kunz

Anomaly detection is challenging, especially for large datasets in high dimensions. Here, we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. DRAMA is released as a general Python package that implements this framework with a wide range of built-in options. The approach identifies the primary prototypes in the data, with anomalies detected by their large distances from the prototypes, either in the latent space or in the original, high-dimensional space. DRAMA is tested on a wide variety of simulated and real datasets, in up to 3000 dimensions, and is found to be robust and highly competitive with commonly used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning, and highly unbalanced datasets. In addition, DRAMA naturally provides clustering of outliers for subsequent analysis.
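The prototype-distance idea can be sketched in a few lines. This is not the DRAMA package or its API: it is a simplified illustration that assumes the prototypes are k-means centroids fitted on a reference sample in the original feature space, and it omits the dimensionality-reduction step. The function names are hypothetical.

```python
import numpy as np

def fit_prototypes(data, k=2, n_iter=50, seed=0):
    """Find k prototypes with plain Lloyd's k-means (illustrative stand-in)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(data), size=k, replace=False)
    centers = data[start].astype(float)
    for _ in range(n_iter):
        # Assign every sample to its nearest prototype.
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def anomaly_scores(points, prototypes):
    """Score each point by its Euclidean distance to the nearest prototype."""
    dists = np.linalg.norm(points[:, None, :] - prototypes[None, :, :], axis=2)
    return dists.min(axis=1)
```

Points with the largest scores are the anomaly candidates. The same scoring function can be applied to latent-space coordinates if the data are first passed through a dimensionality-reduction step.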


Cancers ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 2419
Author(s):  
Georg Steinbuss ◽  
Mark Kriegsmann ◽  
Christiane Zgorzelski ◽  
Alexander Brobeil ◽  
Benjamin Goeppert ◽  
...  

The diagnosis and the subtyping of non-Hodgkin lymphoma (NHL) are challenging and require expert knowledge, great experience, thorough morphological analysis, and often additional expensive immunohistological and molecular methods. As these requirements are not always available, supplemental methods supporting morphology-based decision making and potentially entity subtyping are required. Deep learning methods have been shown to classify histopathological images with high accuracy, but data on NHL subtyping are limited. After annotation of histopathological whole-slide images and image patch extraction, we trained and optimized an EfficientNet convolutional neural network algorithm on 84,139 image patches from 629 patients and evaluated its potential to classify tumor-free reference lymph nodes, nodal small lymphocytic lymphoma/chronic lymphocytic leukemia, and nodal diffuse large B-cell lymphoma. The optimized algorithm achieved an accuracy of 95.56% on an independent test set including 16,960 image patches from 125 patients after the application of quality controls. Automatic classification of NHL is possible with high accuracy using deep learning on histopathological images, and routine diagnostic applications should be pursued.


2021 ◽  
Vol 11 (3) ◽  
pp. 1013
Author(s):  
Zvezdan Lončarević ◽  
Rok Pahič ◽  
Aleš Ude ◽  
Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require the availability of a sufficiently large database of example task executions to compute the latent space. However, the generation of many example task executions on a real robot is tedious, and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering by performing a small number of task executions with a real robot and applying statistical generalization, e.g., Gaussian process regression, to generate more data. We have shown in our experiments that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with a humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.
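The statistical generalization step can be illustrated with a minimal Gaussian process regression sketch. It assumes an RBF kernel with a fixed length scale and returns only the posterior mean; the kernel choice, the hyperparameters, and the function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale ** 2)

def gp_predict(x_train, y_train, x_query, length_scale=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP at the query inputs."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train += noise * np.eye(len(x_train))       # jitter for numerical stability
    k_query = rbf_kernel(x_query, x_train, length_scale)
    weights = np.linalg.solve(k_train, y_train)   # solves K w = y
    return k_query @ weights
```

In a setting like the one described, `x_train` would hold the queries of the few real executions (for example target positions) and `y_train` the corresponding motion parameters; evaluating `gp_predict` on a dense grid of queries then yields additional synthetic examples for training the autoencoder.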


2021 ◽  
Vol 15 ◽  
pp. 174830262110249
Author(s):  
Cong-Zhe You ◽  
Zhen-Qiu Shu ◽  
Hong-Hui Fan

Recently, subspace clustering of multi-view data has become a research hotspot in artificial intelligence and machine learning. The goal is to divide data samples from different sources into different groups. In this paper, we propose a new subspace clustering method for multi-view data, termed Non-negative Sparse Laplacian regularized Latent Multi-view Subspace Clustering (NSL2MSC). The proposed method learns the latent space representation of multi-view data samples and performs data reconstruction in the latent space. The algorithm can cluster data in the latent representation space and exploit the relationships between different views. However, traditional representation-based methods do not consider the non-linear geometry inside the data and may lose local and similarity information between data points during learning. By using graph regularization, we can capture not only the global low-dimensional structural features of the data but also its nonlinear geometric structure. The experimental results show that the proposed method is effective and that its performance is better than most of the existing alternatives.


2021 ◽  
Author(s):  
Sibghatullah I. Khan ◽  
Vikram Palodiya ◽  
Lavanya Poluboyina

Bronchiectasis and chronic obstructive pulmonary disease (COPD) are common human lung diseases. In general, the expert pulmonologist carries out preliminary screening and detection of these lung abnormalities by listening to the adventitious lung sounds. The present paper is an attempt towards the automatic detection of adventitious lung sounds of bronchiectasis and COPD from normal lung sounds of healthy subjects. For classification of the lung sounds into a normal and an adventitious category, we obtain features from the phase space representation (PSR). At first, empirical mode decomposition (EMD) is applied to lung sound signals to obtain intrinsic mode functions (IMFs). The IMFs are then further processed to construct two-dimensional (2D) and three-dimensional (3D) PSRs. The feature space includes the 95% confidence ellipse area and the interquartile range (IQR) of Euclidean distances computed from the 2D and 3D PSRs, respectively. The process is carried out for the first four IMFs corresponding to normal and adventitious lung sound signals. The computed features show a significant ability to discriminate the two categories of lung sound signals. To perform classification, we use the least squares support vector machine (LS-SVM) with two kernels, namely, polynomial and radial basis function (RBF). Simulation outcomes on the ICBHI 2017 lung sound dataset show the ability of the proposed method to effectively classify normal and adventitious lung sound signals. The LS-SVM employing the RBF kernel provides the highest classification accuracy, 97.67%, over the feature space constituted by the first, second, and fourth IMFs.
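The two PSR features can be sketched as follows. The sketch assumes a time-delay embedding with a user-chosen delay, an ellipse derived from the sample covariance under a bivariate normal assumption, and distances measured from the origin; these definitions and the function names are assumptions, since the abstract does not spell them out. The EMD step is not shown, so the input signal stands in for a single IMF.

```python
import numpy as np

# 95% quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = -2.0 * np.log(0.05)

def phase_space(signal, dim, delay=1):
    """Time-delay embedding: row t is (x[t], x[t+delay], ..., x[t+(dim-1)*delay])."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) - (dim - 1) * delay
    return np.column_stack([signal[i * delay: i * delay + n] for i in range(dim)])

def ellipse_area_95(points_2d):
    """Area of the 95% confidence ellipse of a 2D point cloud."""
    cov = np.cov(points_2d, rowvar=False)
    det = max(np.linalg.det(cov), 0.0)   # guard against tiny negative values
    return np.pi * CHI2_95_2DOF * np.sqrt(det)

def distance_iqr(points):
    """Interquartile range of the Euclidean distances of points from the origin."""
    dists = np.linalg.norm(points, axis=1)
    q75, q25 = np.percentile(dists, [75, 25])
    return q75 - q25
```

For one IMF, `ellipse_area_95(phase_space(imf, 2))` and `distance_iqr(phase_space(imf, 3))` give one feature each; repeating this over the selected IMFs builds the feature vector that is passed to the classifier.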

