Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds

Author(s):  
Heysem Kaya ◽  
Oxana Verkholyak ◽  
Maxim Markitantov ◽  
Alexey Karpov


Author(s):
Adam Csapo ◽  
Barna Resko ◽  
Morten Lind ◽  
Peter Baranyi

The computerized modeling of cognitive visual information has been a research field of great interest in the past several decades. The field is interesting not only from a biological perspective, but also from an engineering point of view, as systems are developed that aim to achieve goals similar to those of biological cognitive systems. This article introduces a general framework for the extraction and systematic storage of low-level visual features. The applicability of the framework is investigated in both unstructured and highly structured environments. In the first experiment, a linear categorization algorithm originally developed for the classification of text documents is used to classify natural images taken from the Caltech 101 database. In the second experiment, the framework provides an automated guided vehicle with obstacle detection and auto-positioning functionalities in highly structured environments. The results demonstrate that the model is highly applicable in structured environments and that it also shows promising results in certain cases in unstructured environments.
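As a rough illustration of the first experiment (a sketch under assumptions, not the authors' implementation), the following Python snippet trains a linear, text-categorization-style classifier on precomputed low-level image feature vectors; scikit-learn's LinearSVC stands in for the categorization algorithm, and load_features() is a hypothetical helper producing synthetic data in place of Caltech 101 features.

    # Sketch: linear text-style categorization applied to image feature vectors.
    # LinearSVC and load_features() are illustrative assumptions.
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    def load_features():
        # Hypothetical helper: returns (n_samples, n_features) low-level
        # visual feature vectors with integer class labels.
        rng = np.random.default_rng(0)
        return rng.normal(size=(500, 128)), rng.integers(0, 10, size=500)

    X, y = load_features()
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LinearSVC(C=1.0).fit(X_tr, y_tr)  # linear classifier, as in text categorization
    print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))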


Author(s):  
Charalambos Themistocleous ◽  
Marie Eckerström ◽  
Dimitrios Kokkinakis

Mild Cognitive Impairment (MCI) is a condition characterized by cognitive decline greater than expected for an individual's age and education level. In this study, we investigate whether acoustic properties of speech production can improve the discrimination of individuals with MCI from healthy controls by augmenting the Mini-Mental State Examination (MMSE), a traditional screening tool, with automatically extracted acoustic information. We found that adding just one acoustic feature can improve the AUC score (which measures the trade-off between sensitivity and specificity) from 0.77 to 0.89 in a boosting classification task. These preliminary results suggest that computerized language analysis can improve the accuracy of traditional screening tools.
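A minimal sketch of this kind of experiment (assumed, not the study's code or data) follows: a boosting classifier is scored by AUC using the MMSE score alone and then using MMSE plus a single acoustic feature; the synthetic data and the GradientBoostingClassifier are illustrative choices only.

    # Sketch: comparing AUC for MMSE alone vs. MMSE + one acoustic feature.
    # Synthetic data; GradientBoostingClassifier is an illustrative choice.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 200
    mmse = rng.normal(26, 3, size=n)        # screening-test score
    acoustic = rng.normal(0, 1, size=n)     # one automatically extracted feature
    y = ((mmse + 2 * acoustic + rng.normal(size=n)) < 26).astype(int)  # 1 = MCI

    for name, X in [("MMSE only", mmse[:, None]),
                    ("MMSE + acoustic", np.column_stack([mmse, acoustic]))]:
        clf = GradientBoostingClassifier(random_state=0)
        proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
        print(name, "AUC:", round(roc_auc_score(y, proba), 2))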


2019 ◽  
Vol 9 (5) ◽  
pp. 1020 ◽  
Author(s):  
Lilun Zhang ◽  
Dezhi Wang ◽  
Changchun Bao ◽  
Yongxian Wang ◽  
Kele Xu

Whale vocal calls contain valuable information and abundant characteristics that are important for the classification of whale sub-populations and related biological research. In this study, an effective data-driven approach based on pre-trained Convolutional Neural Networks (CNNs) using multi-scale waveforms and time-frequency feature representations is developed to classify whale calls from a large open-source dataset recorded by sensors carried by whales. Specifically, the classification is carried out through transfer learning, using state-of-the-art CNN models pre-trained in the field of computer vision. 1D raw waveforms and 2D log-mel features of the whale-call data are respectively used as the inputs to the CNN models. For the raw-waveform input, windows are applied to capture multiple sketches of a whale-call clip at different time scales, and the features from the different sketches are stacked for classification. For the log-mel features, the delta and delta-delta features are also calculated to produce a 3-channel feature representation for analysis. During training, a 4-fold cross-validation technique is employed to reduce overfitting, while the Mix-up technique is applied for data augmentation to further improve system performance. The results show that the proposed method improves accuracy by more than 20 percentage points for the classification into 16 whale pods, compared with a baseline method using groups of 2D shape descriptors of spectrograms and Fisher discriminant scores on the same dataset. Moreover, classifications based on log-mel features are shown to be more accurate than those based directly on raw waveforms. A phylogeny graph is also produced to illustrate the relationships among the whale sub-populations.
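A minimal sketch of the 3-channel log-mel representation and of Mix-up (assumptions about the general form, not the authors' pipeline) is shown below; the librosa calls are standard, while logmel_3ch() and mixup() are hypothetical helper names.

    # Sketch: 3-channel log-mel / delta / delta-delta features, plus Mix-up.
    import numpy as np
    import librosa

    def logmel_3ch(path, sr=22050, n_mels=128):
        y, _ = librosa.load(path, sr=sr)
        logmel = librosa.power_to_db(
            librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels))
        d1 = librosa.feature.delta(logmel)             # delta
        d2 = librosa.feature.delta(logmel, order=2)    # delta-delta
        return np.stack([logmel, d1, d2], axis=0)      # (3, n_mels, frames)

    def mixup(x1, y1, x2, y2, alpha=0.2):
        # Mix-up: convex combination of two inputs and their one-hot labels.
        lam = np.random.beta(alpha, alpha)
        return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2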


Author(s):  
Golrokh Mirzaei ◽  
Mohammad Wadood Majid ◽  
Jeremy Ross ◽  
Mohsin M. Jamali ◽  
Peter V. Gorsevski ◽  
...  


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7392
Author(s):  
Danish Nazir ◽  
Muhammad Zeshan Afzal ◽  
Alain Pagani ◽  
Marcus Liwicki ◽  
Didier Stricker

In this paper, we present the idea of self-supervised learning for the shape completion and classification of point clouds. Most 3D shape completion pipelines utilize AutoEncoders to extract features from point clouds that are used in downstream tasks such as classification, segmentation, detection, and other related applications. Our idea is to add contrastive learning to AutoEncoders to encourage global feature learning across the point cloud classes; this is performed by optimizing a triplet loss. Local feature representation learning is performed by adding the Chamfer distance function. To evaluate the performance of our approach, we utilize the PointNet classifier. We also extend the number of evaluation classes from 4 to 10 to show the generalization ability of the learned features. Based on our results, the embeddings generated by the contrastive AutoEncoder improve point-cloud shape completion and classification performance from 84.2% to 84.9%, achieving state-of-the-art results with 10 classes.
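A minimal sketch of the combined objective (an assumption about its general form, not the authors' implementation) follows in PyTorch: a triplet loss on embeddings drives global feature learning, while a naive O(N·M) Chamfer distance on the completed point clouds drives local feature learning; chamfer(), combined_loss(), and the weight w are hypothetical names.

    # Sketch: triplet loss (global features) + Chamfer distance (local features).
    import torch
    import torch.nn.functional as F

    def chamfer(p, q):
        # p: (B, N, 3), q: (B, M, 3); symmetric nearest-neighbor distance.
        d = torch.cdist(p, q)                      # (B, N, M) pairwise distances
        return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

    def combined_loss(emb_a, emb_p, emb_n, pred_pts, gt_pts, margin=1.0, w=0.1):
        # Triplet loss pulls same-class embeddings together; Chamfer distance
        # supervises reconstruction of the completed point cloud.
        tri = F.triplet_margin_loss(emb_a, emb_p, emb_n, margin=margin)
        return chamfer(pred_pts, gt_pts) + w * tri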

