Deep Learning for Audio Event Detection and Tagging on Low-Resource Datasets

In training a deep learning system to perform audio transcription, two practical problems may arise. First, most datasets are weakly labelled: they provide only a list of the events present in each recording, without any temporal information for training. Second, deep neural networks need a very large amount of labelled training data to perform well, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve training performance on such low-resource datasets. We evaluate three data-efficient approaches to training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that different methods of training have different advantages and disadvantages.
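
To make the notion of weak labelling concrete, the following minimal Python sketch contrasts a strongly labelled recording (events with onset and offset times) with its weakly labelled counterpart (only the set of events present). The event names, the timestamps and the helper `to_weak_labels` are hypothetical and are not taken from the paper.

```python
def to_weak_labels(strong_labels):
    """Collapse (event, onset, offset) annotations into a sorted list of event tags.

    The timing information is discarded, which is what a weakly labelled
    dataset looks like: only the events present in the recording remain.
    """
    return sorted({event for event, _onset, _offset in strong_labels})


# Hypothetical strong annotation of one recording (times in seconds).
strong = [
    ("dog_bark", 0.5, 1.2),
    ("car_horn", 3.0, 3.4),
    ("dog_bark", 7.1, 7.9),
]

weak = to_weak_labels(strong)
print(weak)
```

A model trained only on `weak` has to infer where in the recording each event occurs, which is the difficulty the abstract describes.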

Imitation Learning System Design with Small Training Data for Flexible Tool Manipulation

International Journal of Automation Technology ◽ 10.20965/ijat.2021.p0669 ◽ 2021 ◽ Vol 15 (5) ◽ pp. 669-677

Author(s): Harumo Sasatake ◽ Ryosuke Tasaki ◽ Takahito Yamashita ◽ Naoki Uchiyama ◽ ...

Keyword(s): Deep Learning ◽ Expert Knowledge ◽ Teaching Method ◽ Population Aging ◽ Developed Countries ◽ Learning System ◽ Training Data ◽ Robot Arm ◽ Flexible Tool ◽ Human Labor

Population aging has become a major problem in developed countries. As the labor force declines, robot arms are expected to replace human labor for simple tasks. A robot arm is fitted with a tool specialized for a task and acquires the required movement through teaching by an engineer with expert knowledge. However, the number of such engineers is limited; therefore, a teaching method that non-technical personnel can use is needed. As a teaching method, deep learning can be used to imitate human behavior and tool usage, but it requires a large amount of training data. In this study, the target task of the robot is to sweep multiple pieces of dirt using a broom. The proposed learning system can estimate the initial parameters for deep learning based on experience, as well as on the shape and physical properties of the tools, which reduces the number of training data points needed when learning a new tool. A virtual reality system is used to move the robot arm easily and safely, as well as to create training data for imitation. Cleaning experiments are conducted to evaluate the effectiveness of the proposed method. The experimental results confirm that the proposed method speeds up deep learning and acquires cleaning ability from a small amount of training data.

Diagnostic assessment of a deep learning system for detecting atrial fibrillation in pulse waveforms

Heart ◽ 10.1136/heartjnl-2018-313147 ◽ 2018 ◽ Vol 104 (23) ◽ pp. 1921-1928 ◽ Cited By ~ 36

Author(s): Ming-Zher Poh ◽ Yukkee Cheung Poh ◽ Pak-Hei Chan ◽ Chun-Ka Wong ◽ Louise Pun ◽ ...

Keyword(s): Atrial Fibrillation ◽ Deep Learning ◽ Test Data ◽ Predictive Value ◽ Characteristic Curve ◽ Performance Comparison ◽ Learning System ◽ Training Data ◽ Validation Data ◽ Data Set

Objective: To evaluate the diagnostic performance of a deep learning system for automated detection of atrial fibrillation (AF) in photoplethysmographic (PPG) pulse waveforms. Methods: We trained a deep convolutional neural network (DCNN) to detect AF in 17 s PPG waveforms using a training data set of 149 048 PPG waveforms constructed from several publicly available PPG databases. The DCNN was validated using an independent test data set of 3039 smartphone-acquired PPG waveforms from adults at high risk of AF at a general outpatient clinic against ECG tracings reviewed by two cardiologists. Six established AF detectors based on handcrafted features were evaluated on the same test data set for performance comparison. Results: In the validation data set (3039 PPG waveforms) consisting of three sequential PPG waveforms from 1013 participants (mean (SD) age, 68.4 (12.2) years; 46.8% men), the prevalence of AF was 2.8%. The area under the receiver operating characteristic curve (AUC) of the DCNN for AF detection was 0.997 (95% CI 0.996 to 0.999) and was significantly higher than all the other AF detectors (AUC range: 0.924–0.985). The sensitivity of the DCNN was 95.2% (95% CI 88.3% to 98.7%), specificity was 99.0% (95% CI 98.6% to 99.3%), positive predictive value (PPV) was 72.7% (95% CI 65.1% to 79.3%) and negative predictive value (NPV) was 99.9% (95% CI 99.7% to 100%) using a single 17 s PPG waveform. Using the three sequential PPG waveforms in combination (<1 min in total), the sensitivity was 100.0% (95% CI 87.7% to 100%), specificity was 99.6% (95% CI 99.0% to 99.9%), PPV was 87.5% (95% CI 72.5% to 94.9%) and NPV was 100% (95% CI 99.4% to 100%). Conclusions: In this evaluation of PPG waveforms from adults screened for AF in a real-world primary care setting, the DCNN had high sensitivity, specificity, PPV and NPV for detecting AF, outperforming other state-of-the-art methods based on handcrafted features.
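
The relationship between the reported sensitivity, specificity, prevalence and predictive values can be checked with Bayes' rule. The sketch below is an illustration only: the function `predictive_values` is a hypothetical helper, and it uses the rounded single-waveform figures quoted in the abstract (sensitivity 95.2%, specificity 99.0%, prevalence 2.8%), so its output should only approximately match the published PPV and NPV.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values quoted in the abstract for a single 17 s PPG waveform.
ppv, npv = predictive_values(sensitivity=0.952, specificity=0.990, prevalence=0.028)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these rounded inputs the PPV comes out near 73% and the NPV near 99.9%, close to the reported 72.7% and 99.9%. The calculation also shows why the PPV is well below the sensitivity and specificity: at a prevalence of 2.8%, even a 1% false-positive rate yields a noticeable share of false alarms among positive results.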

Event Detection for Distributed Acoustic Sensing: Combining Knowledge-Based, Classical Machine Learning, and Deep Learning Approaches

Sensors ◽ 10.3390/s21227527 ◽ 2021 ◽ Vol 21 (22) ◽ pp. 7527

Author(s): Mugdim Bublin

Keyword(s): Machine Learning ◽ Deep Learning ◽ Event Detection ◽ Machine Learning Algorithms ◽ Learning Approaches ◽ Efficient System ◽ Acoustic Sensing ◽ Advantages And Disadvantages ◽ Knowledge Based ◽ Distributed Acoustic Sensing

Distributed Acoustic Sensing (DAS) is a promising new technology for pipeline monitoring and protection. However, a big challenge is distinguishing between relevant events, like intrusion by an excavator near the pipeline, and interference, like land machines. This paper investigates whether it is possible to achieve adequate detection accuracy with classic machine learning algorithms using simulations and real system implementation. Then, we compare classical machine learning with a deep learning approach and analyze the advantages and disadvantages of both approaches. Although acceptable performance can be achieved with both approaches, preliminary results show that deep learning is the more promising approach, eliminating the need for laborious feature extraction and offering a six times lower event detection delay and twelve times lower execution time. However, we achieved the best results by combining deep learning with the knowledge-based and classical machine learning approaches. At the end of this manuscript, we propose general guidelines for efficient system design combining knowledge-based, classical machine learning, and deep learning approaches.

Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection

10.21437/interspeech.2018-1120 ◽ 2018 ◽ Cited By ~ 1

Author(s): Shao-Yen Tseng ◽ Juncheng Li ◽ Yun Wang ◽ Florian Metze ◽ Joseph Szurley ◽ ...

Keyword(s): Deep Learning ◽ Event Detection ◽ Small Footprint ◽ Audio Event ◽ Weakly Supervised

Device Invariant Deep Neural Networks for Pulmonary Audio Event Detection Across Mobile and Wearable Devices

10.1109/embc46164.2021.9629853 ◽ 2021

Author(s): Mohsin Y Ahmed ◽ Li Zhu ◽ Md Mahbubur Rahman ◽ Tousif Ahmed ◽ Jilong Kuang ◽ ...

Keyword(s): Neural Networks ◽ Event Detection ◽ Deep Neural Networks ◽ Wearable Devices ◽ Audio Event

Detection of Fiber Defects Using Keypoints and Deep Learning

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001421500166 ◽ 2020 ◽ pp. 2150016

Author(s): Dirk Siegmund ◽ Biying Fu ◽ Adán José-García ◽ Ahmad Salahuddin ◽ Arjan Kuijper

Keyword(s): Deep Learning ◽ Deep Neural Networks ◽ Unsupervised Classification ◽ Training Data ◽ Manual Task ◽ Processing Pipeline ◽ Vision Inspection ◽ Exact Position ◽ Supervised Methods ◽ Industrial Textiles

Because textile fibers deform and change dynamically, the quality assurance of cleaned industrial textiles is still a mostly manual task. Usually, textiles need to be spread flat in order to detect defects using computer vision inspection methods. Existing methods for detecting defects on such inhomogeneous, voluminous surfaces are mainly supervised, are based on deep neural networks and require large amounts of labeled training data. In contrast, we present a novel unsupervised method, based on SURF keypoints, that does not require any training data. We propose using the location, number and orientation of the keypoints to group them into spatially close clusters. The keypoint clusters also indicate the exact position of the defect. We furthermore compare our approach to supervised methods using deep learning. The presented processing pipeline shows how normalization and classification methods need to be combined in order to reliably detect fiber defects such as cuts and holes. We evaluate the performance of our system in real-world settings with images of piles of textiles, taken in stereo vision. Our results show that our novel unsupervised classification method using keypoint clustering achieves results comparable to those of supervised methods.

Attacking Audio Event Detection Deep Learning Classifiers with White Noise

The 14th PErvasive Technologies Related to Assistive Environments Conference ◽ 10.1145/3453892.3464893 ◽ 2021

Author(s): Rodrigo Augusto dos Santos ◽ Shirin Nilizadeh ◽ Ashwitha Venkata Kassetty

Keyword(s): Deep Learning ◽ White Noise ◽ Event Detection ◽ Learning Classifiers ◽ Audio Event

Audio Event Detection Using Wireless Sensor Networks Based on Deep Learning

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Wireless Internet ◽ 10.1007/978-3-030-06158-6_11 ◽ 2019 ◽ pp. 105-115

Author(s): Jose Marie Mendoza ◽ Vanessa Tan ◽ Vivencio Fuentes ◽ Gabriel Perez ◽ Nestor Michael Tiglao

Keyword(s): Wireless Sensor Networks ◽ Deep Learning ◽ Sensor Networks ◽ Event Detection ◽ Wireless Sensor ◽ Audio Event

On regularization properties of artificial datasets for deep learning

Computer Science and Mathematical Modelling ◽ 10.5604/01.3001.0013.6599 ◽ 2019 ◽ Vol 0 (9/2019) ◽ pp. 13-18

Author(s): Karol Antczak

Keyword(s): Neural Networks ◽ Deep Learning ◽ Deep Neural Networks ◽ Real Data ◽ Training Data ◽ Generation Process ◽ Data Generation ◽ Artificial Data ◽ High Level ◽ Artificial Datasets

The paper discusses the regularization properties of artificial data for deep learning. Artificial datasets make it possible to train neural networks when real data are scarce. It is demonstrated that the artificial data generation process, described as injecting noise into high-level features, bears several similarities to existing regularization methods for deep neural networks. One can treat this property of artificial data as a kind of “deep” regularization. It is thus possible to regularize the hidden layers of the network by generating the training data in a certain way.
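
As a rough illustration of generating artificial data by injecting noise into high-level features, the Python sketch below perturbs a small feature vector with Gaussian noise and then renders each noisy vector into a data sample. Everything in it is an assumption made for illustration: the function names, the choice of Gaussian noise and the toy renderer (a line defined by slope and intercept) do not come from the paper.

```python
import random


def generate_artificial_samples(features, render, n_samples, noise_std, seed=0):
    """Create samples by adding Gaussian noise to high-level features, then rendering."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(n_samples):
        noisy = [f + rng.gauss(0.0, noise_std) for f in features]
        samples.append(render(noisy))
    return samples


def render_line(features):
    """Toy renderer: high-level features (slope, intercept) -> five points on a line."""
    slope, intercept = features
    return [slope * x + intercept for x in range(5)]


# Three artificial samples drawn around the hypothetical features slope=2, intercept=1.
samples = generate_artificial_samples([2.0, 1.0], render_line, n_samples=3, noise_std=0.1)
for sample in samples:
    print([round(value, 2) for value in sample])
```

Because the noise enters at the level of slope and intercept rather than at individual output points, every generated sample is still a valid line. This is the sense in which the perturbation acts on high-level features rather than on the raw data.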