Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

Cesar Federico Caiafa; Jordi Solé-Casals; Pere Marti-Puig; Sun Zhe; Toshihisa Tanaka

doi:10.3390/app10238481

Applied Sciences ◽

10.3390/app10238481 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8481

Author(s):

Cesar Federico Caiafa ◽

Jordi Solé-Casals ◽

Pere Marti-Puig ◽

Sun Zhe ◽

Toshihisa Tanaka

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Unsupervised Classification ◽

Decomposition Methods ◽

Signal Decomposition ◽

Learning Performance ◽

Decomposition Approach ◽

Data Completion ◽

Machine Learning Applications

In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are small to begin with, and more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets is a fundamental and classic challenge in machine learning. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
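
As a rough illustration of what decomposition-based data completion means, the sketch below fills missing entries of a small matrix by fitting a rank-1 model with alternating least squares. This is a deliberately simple stand-in and not the method reviewed in the article; the function name `complete_rank1`, the toy matrix, and the rank-1 assumption are all hypothetical choices made for this example.

```python
def complete_rank1(matrix, n_iter=200):
    """Fill None entries of `matrix` assuming matrix[i][j] ~ u[i] * v[j]."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        # Least-squares update of u[i] from the observed entries of row i.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in obs) / den
        # Least-squares update of v[j] from the observed entries of column j.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in obs) / den
    # Keep observed values; replace missing ones by the model prediction.
    return [
        [matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Synthetic toy data: outer product of [1, 2, 3] and [1, 0.5, 2, 4],
# with three entries hidden (None) to play the role of missing features.
observed = [
    [1.0, None, 2.0, 4.0],
    [2.0, 1.0, 4.0, None],
    [None, 1.5, 6.0, 12.0],
]
filled = complete_rank1(observed)
```

Real datasets are rarely exactly rank-1, so a practical version would use a higher rank or a different decomposition and would validate the imputed values on held-out entries.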

Learning and control

10.1093/oso/9780199674923.003.0026 ◽

2018 ◽

Author(s):

Ivan Herreros

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Brain Function ◽

Control Strategies ◽

Learning Problems ◽

Animal Learning ◽

Feed Forward Control ◽

Machine Learning Applications ◽

And Control

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and later introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the domain of the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, reinforcement learning and operant conditioning, and between unsupervised and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs anticipatory and adaptive control. Finally, it argues how this framework of translating knowledge between formal and biological disciplines can serve us to not only structure and advance our understanding of brain function but also enrich engineering solutions at the level of robot learning and control with insights coming from biology.

Newly Proposed Technique for Autism Spectrum Disorder based Machine Learning

International Journal of Computer Science and Information Technology ◽

10.5121/ijcsit.2021.13201 ◽

2021 ◽

Vol 13 (2) ◽

pp. 1-15

Author(s):

Sherif Kamel ◽

Rehab Al-harbi

Keyword(s):

Machine Learning ◽

Autism Spectrum Disorder ◽

Autism Spectrum ◽

Spectrum Disorder ◽

Screening Methods ◽

Disease Prediction ◽

Healthcare Organizations ◽

Machine Learning Applications ◽

Research Stage

The rapid growth in the number of autism cases among toddlers calls for the development of easily implemented and effective screening methods. The causes of Autism Spectrum Disorder (ASD) are not yet known; diagnosis and detection of ASD are instead based on behaviours and symptoms. This paper aims to improve ASD prediction accuracy among toddlers by applying a Logistic Regression model to a collected health care dataset, using an algorithm for rapid classification of behaviours to check whether or not a child has autism according to the information in the dataset. Machine learning thereby decreases the time needed to detect the disorder, so that the necessary health services can be provided early to affected toddlers to improve their quality of life. In healthcare, most machine learning applications are in the research stage, and to take advantage of emerging software tools that incorporate artificial intelligence, healthcare organizations first need to overcome a variety of challenges.

Multivariate mixed kernel density estimators and their application in machine learning for classification of biological objects based on spectral measurements

Computer Optics ◽

10.18287/2412-6179-2019-43-4-677-691 ◽

2019 ◽

Vol 43 (4) ◽

pp. 677-691

Author(s):

A.A. Sirota ◽

A.O. Donskikh ◽

A.V. Akimov ◽

D.A. Minakov

Keyword(s):

Machine Learning ◽

Density Estimation ◽

Data Augmentation ◽

Kernel Density ◽

Machine Learning Algorithms ◽

Spectral Measurements ◽

Density Estimates ◽

Biological Objects ◽

Density Estimators

A problem of non-parametric multivariate density estimation for machine learning and data augmentation is considered. A new mixed density estimation method based on calculating the convolution of independently obtained kernel density estimates for unknown distributions of informative features and a known (or independently estimated) density for non-informative interference occurring during measurements is proposed. Properties of the mixed density estimates obtained using this method are analyzed. The method is compared with a conventional Parzen-Rosenblatt window method applied directly to the training data. The equivalence of the mixed kernel density estimator and the data augmentation procedure based on the known (or estimated) statistical model of interference is theoretically and experimentally proven. The applicability of the mixed density estimators for training of machine learning algorithms for the classification of biological objects (elements of grain mixtures) based on spectral measurements in the visible and near-infrared regions is evaluated.
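
The relationship between the mixed estimator and data augmentation described in this abstract can be checked numerically in one dimension. The sketch below assumes Gaussian kernels and Gaussian interference, which is a simplification chosen for this example and not taken from the paper: in that case, convolving the kernel density estimate with the noise density is the same as widening the bandwidth to `sqrt(h**2 + noise_sigma**2)`, and a plain KDE on noise-augmented copies of the data approximates the same curve. All names and the toy sample values are hypothetical.

```python
import math
import random


def gauss(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def kde(x, samples, h):
    """Parzen-Rosenblatt estimate at x with a Gaussian kernel of bandwidth h."""
    return sum(gauss(x - s, h) for s in samples) / len(samples)


def mixed_kde_closed_form(x, samples, h, noise_sigma):
    """KDE convolved with Gaussian noise: bandwidths add in quadrature."""
    return kde(x, samples, math.hypot(h, noise_sigma))


def mixed_kde_numeric(x, samples, h, noise_sigma, half_width=8.0, steps=4000):
    """Same convolution, evaluated with the trapezoidal rule as a cross-check."""
    lo = -half_width * noise_sigma
    dt = 2.0 * half_width * noise_sigma / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * kde(x - t, samples, h) * gauss(t, noise_sigma)
    return total * dt


def augment(samples, noise_sigma, copies, seed=0):
    """Data augmentation: noisy copies of each sample from the noise model."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_sigma) for s in samples for _ in range(copies)]


# Synthetic toy values, chosen only for illustration.
samples = [0.1, 0.4, 0.45, 1.2, 1.9]
h, noise_sigma, x = 0.2, 0.3, 0.4

closed = mixed_kde_closed_form(x, samples, h, noise_sigma)
numeric = mixed_kde_numeric(x, samples, h, noise_sigma)
augmented = kde(x, augment(samples, noise_sigma, copies=2000), h)
```

Here `closed` and `numeric` should agree to within numerical integration error, while `augmented` is a Monte Carlo approximation that gets closer as `copies` grows.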

A combined approach of convolutional neural networks and machine learning for visual fault classification in photovoltaic modules

Proceedings of the Institution of Mechanical Engineers Part O Journal of Risk and Reliability ◽

10.1177/1748006x211020305 ◽

2021 ◽

pp. 1748006X2110203

Author(s):

Sridharan Naveen Venkatesh ◽

Vaithiyanathan Sugumaran

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Performance Comparison ◽

Image Features ◽

Image Feature ◽

Fault Classification ◽

Photovoltaic Modules ◽

Accurate Performance ◽

Deep Cnn

Fault diagnosis plays a significant role in enhancing the useful lifetime, power output, and reliability of photovoltaic modules (PVM). Visual faults such as burn marks, delamination, discoloration, glass breakage, and snail trails make detection of faults difficult under harsh environmental conditions. Various researchers have made several attempts to identify visual faults in a PVM. However, most previous studies centered on the identification and analysis of a limited number of faults. This article presents the use of a deep convolutional neural network (CNN) to extract image features and perform an effective classification of faults by machine learning (ML) algorithms. In contrast to existing work, five different fault conditions were considered in the study. The proposed solution consists of three phases to effectively analyze various PVM defects. First, the module images are acquired using unmanned aerial vehicles (UAVs) and data augmentation is performed to generate a uniform dataset. Afterward, a pre-trained deep CNN is adopted for image feature extraction. Finally, the extracted image features are classified with the help of various ML classifiers. The final results show the effectiveness of the pre-trained deep CNN and the accurate performance of the ML classifiers. The best-in-class ML classifier for multiple fault classification is suggested based on the performance comparison.

A Comparative Study of the Impact of Data Augmentation in Machine Learning Based Classification Accuracy

10.32920/ryerson.14661300 ◽

2021 ◽

Author(s):

Arif Jahangir

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Machine Learning Algorithms ◽

Quadratic Discriminant Analysis ◽

Crucial Importance ◽

Markov Transition Matrix ◽

Original Dataset ◽

Markov Transition ◽

The Impact

Traumatic Brain Injury is the primary cause of death and disability all over the world. Monitoring the intracranial pressure (ICP) and classifying it for hypertension signals is of crucial importance. This thesis explores the possibility of better classification of the ICP signal and detection of hypertensive signals prior to the actual occurrence of hypertensive episodes. This study differs from other approaches in that the time series is converted into images by Gramian angular field and Markov transition matrix, and the data are augmented. Because the data are unbalanced, the effect of the SMOTE extended nearest neighbour algorithm for balancing the data is examined. We use various machine learning algorithms to classify the ICP signals. The results obtained show that AdaBoost performs best among the compared algorithms. The F1 score of AdaBoost is 0.95 on the original dataset and 0.9967 on the balanced and augmented dataset. The Quadratic Discriminant Analysis F1 score is 1 when the data are augmented and balanced.
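
The two time-series-to-image encodings named in this abstract can be sketched in a few lines. The code below is an illustrative version, not the thesis implementation: it computes the summation form of the Gramian angular field (rescale to [-1, 1], take `phi = arccos(x)`, then `G[i][j] = cos(phi_i + phi_j)`) and a first-order Markov transition matrix. The equal-width binning and the function names are assumptions made for this example; quantile bins are a common alternative.

```python
import math


def rescale(series):
    """Linearly map a series onto [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]


def gramian_angular_field(series):
    """Summation GAF: cos(phi_i + phi_j) with phi = arccos(rescaled value)."""
    x = rescale(series)
    # sin(arccos(x)) = sqrt(1 - x^2); clamp to guard against rounding.
    s = [math.sqrt(max(0.0, 1.0 - v * v)) for v in x]
    n = len(x)
    # cos(a + b) = cos(a)cos(b) - sin(a)sin(b)
    return [[x[i] * x[j] - s[i] * s[j] for j in range(n)] for i in range(n)]


def markov_transition_matrix(series, n_bins):
    """Row-normalised counts of transitions between equal-width value bins."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    states = [min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1) for v in series]
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Bins that are never left keep an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Synthetic toy series, for illustration only.
gaf = gramian_angular_field([0.0, 1.0, 2.0])
mtm = markov_transition_matrix([0.0, 1.0, 0.0, 1.0, 0.0], n_bins=2)
```

Both outputs are square matrices that can be treated as single-channel images and passed to an image classifier.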

Data augmentation methods for machine-learning-based classification of bio-signals

2017 10th Biomedical Engineering International Conference (BMEiCON) ◽

10.1109/bmeicon.2017.8229109 ◽

2017 ◽

Cited By ~ 1

Author(s):

Asuka Sakai ◽

Yuki Minoda ◽

Koji Morikawa

Keyword(s):

Machine Learning ◽

Data Augmentation

Classification of Alzheimers’ Dementia by Using Various Signal Decomposition Methods

10.1109/tiptekno53239.2021.9633007 ◽

2021 ◽

Author(s):

Ozlem Karabiber Cura ◽

Gulce Cosku Yilmaz ◽

Hatice Sabiha Ture ◽

Aydin Akan

Keyword(s):

Decomposition Methods ◽

Signal Decomposition

Comparison of EEG signal decomposition methods in classification of motor-imagery BCI

Multimedia Tools and Applications ◽

10.1007/s11042-017-5586-9 ◽

2018 ◽

Vol 77 (16) ◽

pp. 21305-21327 ◽

Cited By ~ 8

Author(s):

Eltaf Abdalsalam Mohamed ◽

Mohd Zuki Yusoff ◽

Aamir Saeed Malik ◽

Mohammad Rida Bahloul ◽

Dalia Mahmoud Adam ◽

...

Keyword(s):

Motor Imagery ◽

Decomposition Methods ◽

Signal Decomposition ◽

Eeg Signal

Unsupervised classification of specialty coffees in Homogeneous sensory attributes through machine learning

Coffee Science ◽

10.25186/.v15i.1780 ◽

2020 ◽

Vol 15 ◽

pp. 1-9

Author(s):

Paulo Cesar Ossani ◽

Diogo Francisco Rossoni ◽

Marcelo Ângelo Cirillo ◽

Flávio Meira Borém

Keyword(s):

Machine Learning ◽

Unsupervised Classification ◽

Sensory Attributes

Uncertainty in Machine Learning Applications: A Practice-Driven Classification of Uncertainty

Developments in Language Theory - Lecture Notes in Computer Science ◽

10.1007/978-3-319-99229-7_36 ◽

2018 ◽

pp. 431-438 ◽

Cited By ~ 8

Author(s):

Michael Kläs ◽

Anna Maria Vollmer

Keyword(s):

Machine Learning ◽

Machine Learning Applications
