A Stacking-based Ensemble Learning Method for Outlier Detection

Mapping Intimacies ◽

10.21203/rs.3.rs-677845/v1 ◽

2021 ◽

Author(s):

Abdul Ahad Abro

Keyword(s):

Machine Learning ◽

Outlier Detection ◽

Area Under Curve ◽

Binary Classification ◽

Ensemble Classifier ◽

Performance Criteria ◽

Learning Problems ◽

Learning Method ◽

Rotation Forest ◽

Research Areas

Outlier detection is considered one of the crucial research areas in data mining. Many methods have been studied and applied to improve outlier detection; however, the results of these approaches remain inadequate. In this paper, a stacking-based ensemble classifier is proposed with four base learners (namely, Rotation Forest, Random Forest, Bagging and Boosting) and a meta-learner (namely, Logistic Regression) to improve outlier detection performance. The proposed mechanism is evaluated on five datasets from the ODDS library using five performance criteria. The experimental outcomes demonstrate that the proposed method outperforms conventional ensemble approaches in terms of accuracy, AUC (Area Under Curve), precision, recall and F-measure. This method can be used for image recognition and machine learning problems, such as binary classification.
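
The five performance criteria named in the abstract can be computed from true labels, predicted labels, and predicted scores. The following is a minimal sketch in plain Python, assuming binary labels where 1 marks an outlier; the function names are illustrative and do not come from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks an outlier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators (no predicted or no actual outliers).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc(y_true, scores):
    """AUC as the fraction of (outlier, inlier) pairs ranked correctly.

    Ties between an outlier score and an inlier score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one outlier and one inlier")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be the usual choice for larger datasets.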

Ens-PPI: A Novel Ensemble Classifier for Predicting the Interactions of Proteins Using Autocovariance Transformation from PSSM

BioMed Research International ◽

10.1155/2016/4563524 ◽

2016 ◽

Vol 2016 ◽

pp. 1-8 ◽

Cited By ~ 13

Author(s):

Zhen-Guo Gao ◽

Lei Wang ◽

Shi-Xiong Xia ◽

Zhu-Hong You ◽

Xin Yan ◽

...

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Biological Activities ◽

Ensemble Classifier ◽

Prediction Performance ◽

Protein Protein Interactions ◽

Rotation Forest ◽

Proposed Model ◽

Protein Amino Acids ◽

Scoring Matrix

Protein-Protein Interactions (PPIs) play vital roles in most biological activities. Although the development of high-throughput biological technologies has generated considerable PPI data for various organisms, many problems are still far from being solved. A number of computational methods based on machine learning have been developed to facilitate the identification of novel PPIs. In this study, a novel predictor was designed using the Rotation Forest (RF) algorithm combined with Autocovariance (AC) features extracted from the Position-Specific Scoring Matrix (PSSM). More specifically, the PSSMs are generated from protein amino acid sequence information. Then, an effective sequence-based feature representation, Autocovariance, is employed to extract features from the PSSMs. Finally, the RF model is used as a classifier to distinguish between interacting and noninteracting protein pairs. The proposed method achieves promising prediction performance on the PPIs of the Yeast, H. pylori, and independent datasets. The good results show that the proposed model is suitable for PPI prediction and could also provide a useful supplementary tool for solving other bioinformatics problems.
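
The abstract does not give the autocovariance formula, so the sketch below assumes the commonly used definition: for each PSSM column j and each lag g, average the product of mean-centered scores at positions i and i + g. The function name and the `max_lag` parameter are illustrative, not taken from the paper.

```python
def autocovariance_features(pssm, max_lag):
    """Turn a variable-length PSSM into a fixed-length feature vector.

    pssm: list of rows, one per residue, each with the same number of
          columns (20 for a standard PSSM).
    max_lag: largest distance between residue positions to consider
             (assumed parameter; must be smaller than the sequence length).
    Returns a flat list of max_lag * n_cols values, ordered by lag,
    then by column.
    """
    length = len(pssm)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(pssm[0])

    # Per-column mean over the whole sequence.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features
```

Because the output length depends only on `max_lag` and the number of columns, proteins of different lengths map to vectors of the same size, which is what a classifier such as Rotation Forest needs as input.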

Binary classification of single qubits using quantum machine learning method

Journal of Physics Conference Series ◽

10.1088/1742-6596/2006/1/012020 ◽

2021 ◽

Vol 2006 (1) ◽

pp. 012020

Author(s):

Teng Yi ◽

Jie Wang ◽

Fufang Xu

Keyword(s):

Machine Learning ◽

Binary Classification ◽

Machine Learning Method ◽

Learning Method ◽

Quantum Machine Learning

Handling class overlapping to detect noisy instances in classification

The Knowledge Engineering Review ◽

10.1017/s0269888918000115 ◽

2018 ◽

Vol 33 ◽

Cited By ~ 2

Author(s):

Shivani Gupta ◽

Atul Gupta

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Binary Classification ◽

Vital Role ◽

Learning Problems ◽

Data Sets ◽

Class Noise ◽

Principal Contributor ◽

Real World Problems ◽

Noise Filters

Automated machine classification plays a vital role in machine learning and data mining. Each classifier is likely to work well on some data sets and less well on others, which increases the importance of evaluation. The performance of learning models depends heavily on the characteristics of the data sets. Previous results suggest that overlap between classes and the presence of noise have the strongest impact on the performance of learning algorithms. The class overlap problem is a critical problem in which data samples appear as valid instances of more than one class, which may be responsible for the presence of noise in data sets. The objective of this paper is to better understand the data used in machine learning problems, to learn about its issues, and to analyze the instances that are highly overlapped by using newly proposed overlap measures. The proposed overlap measures are Nearest Enemy Ratio, SubConcept Ratio, Likelihood Ratio and Soft Margin Ratio. For this experiment, we created 438 binary classification data sets from real-world problems and computed the values of 12 data complexity metrics to find highly overlapped data sets. We then applied the measures to identify the overlapped instances and four noise filters to find the noisy instances. Using the four noise filters, we found that 60–80% of overlapped instances are noisy instances. We conclude that class overlap is a principal contributor to class noise in data sets.

Large-Scale Data Learning Method for Anomaly Detection using Machine Learning for Monitoring Vibration in Vehicle Equipment

IEEJ Transactions on Industry Applications ◽

10.1541/ieejias.140.480 ◽

2020 ◽

Vol 140 (6) ◽

pp. 480-487

Author(s):

Minoru Kondo

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Large Scale ◽

Learning Method ◽

Large Scale Data ◽

Scale Data

Speech Organ Contour Extraction Using Real-Time MRI and Machine Learning Method

10.21437/interspeech.2019-1593 ◽

2019 ◽

Author(s):

Hironori Takemoto ◽

Tsubasa Goto ◽

Yuya Hagihara ◽

Sayaka Hamanaka ◽

Tatsuya Kitamura ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Machine Learning Method ◽

Learning Method ◽

Contour Extraction

Learning and control

10.1093/oso/9780199674923.003.0026 ◽

2018 ◽

Author(s):

Ivan Herreros

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Brain Function ◽

Control Strategies ◽

Learning Problems ◽

Animal Learning ◽

Feed Forward Control ◽

Machine Learning Applications ◽

And Control

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and later introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the domain of the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, reinforcement learning and operant conditioning, and between unsupervised and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs anticipatory and adaptive control. Finally, it argues how this framework of translating knowledge between formal and biological disciplines can serve us to not only structure and advance our understanding of brain function but also enrich engineering solutions at the level of robot learning and control with insights coming from biology.

Precipitation Modeling for Extreme Weather Based on Sparse Hybrid Machine Learning and Markov Chain Random Field in a Multi-Scale Subspace

Water ◽

10.3390/w13091241 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1241

Author(s):

Ming-Hsi Lee ◽

Yenming J. Chen

Keyword(s):

Machine Learning ◽

Markov Chain ◽

Random Field ◽

Long Range ◽

Weather Conditions ◽

Extreme Weather ◽

Prediction Algorithm ◽

Learning Method ◽

Multi Scale ◽

Hybrid Machine

This paper proposes applying a Markov chain random field conditioning method together with a hybrid machine learning method to provide long-range precipitation predictions under increasingly extreme weather conditions. Existing precipitation models are limited in time-span, and long-range simulations cannot predict rainfall distribution for a specific year. This paper proposes a hybrid (ensemble) learning method to perform forecasting on a multi-scaled, conditioned functional time series over a sparse l1 space. On the basis of this method, a long-range prediction algorithm is developed for applications such as agriculture or construction works. Our findings show that the conditioning method and multi-scale decomposition in the sparse space l1 prove useful in resisting statistical variation due to increasingly extreme weather conditions. Because the predictions are year-specific, we verify our prediction accuracy for the year we are interested in, but not for other years.

A Novel Ensemble Machine Learning Method to Detect Phishing Attack

2020 IEEE 23rd International Multitopic Conference (INMIC) ◽

10.1109/inmic50486.2020.9318210 ◽

2020 ◽

Author(s):

Abdul Basit ◽

Maham Zafar ◽

Abdul Rehman Javed ◽

Zunera Jalil

Keyword(s):

Machine Learning ◽

Machine Learning Method ◽

Learning Method ◽

Ensemble Machine Learning

Integrating Machine/Deep Learning Methods and Filtering Techniques for Reliable Mineral Phase Segmentation of 3D X-ray Computed Tomography Images

Energies ◽

10.3390/en14154595 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4595

Author(s):

Parisa Asadi ◽

Lauren E. Beckingham

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Ct Images ◽

Ct Imaging ◽

Learning Method ◽

Learning Methods ◽

X Ray ◽

Machine Learning Methods ◽

Filtering Techniques

X-ray CT imaging provides a 3D view of a sample and is a powerful tool for investigating the internal features of porous rock. Reliable phase segmentation in these images is highly necessary but, like any other digital rock imaging technique, is time-consuming, labor-intensive, and subjective. Combining 3D X-ray CT imaging with machine learning methods that can simultaneously consider several extracted features in addition to color attenuation is a promising and powerful method for reliable phase segmentation. Machine learning-based phase segmentation of X-ray CT images enables faster data collection and interpretation than traditional methods. This study investigates the performance of several filtering techniques with three machine learning methods and a deep learning method to assess the potential for reliable feature extraction and pixel-level phase segmentation of X-ray CT images. Features were first extracted from images using well-known filters and from the second convolutional layer of the pre-trained VGG16 architecture. Then, K-means clustering, Random Forest, and Feed Forward Artificial Neural Network methods, as well as the modified U-Net model, were applied to the extracted input features. The models' performances were then compared to determine the influence of the machine learning method and input features on reliable phase segmentation. The results showed that considering more dimensionality is promising and that all classification algorithms achieve high accuracy, ranging from 0.87 to 0.94. Feature-based Random Forest demonstrated the best performance among the machine learning models, with an accuracy of 0.88 for Mancos and 0.94 for Marcellus. The U-Net model with the linear combination of focal and dice loss also performed well, with an accuracy of 0.91 and 0.93 for Mancos and Marcellus, respectively. In general, considering more features provided promising and reliable segmentation results that are valuable for analyzing the composition of dense samples, such as shales, which are significant unconventional reservoirs in oil recovery.
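
The abstract mentions a linear combination of focal and dice loss without giving the formulas or weights. The sketch below uses the standard binary forms of both losses as a simplification (the paper segments several mineral phases, so its version is presumably multi-class); the `weight` and `gamma` values are hypothetical defaults, not values reported by the authors.

```python
import math


def dice_loss(probs, targets, eps=1e-7):
    """1 minus the soft Dice coefficient over a flat list of pixels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; gamma=0 reduces it to cross-entropy."""
    total = 0.0
    for p, t in zip(probs, targets):
        # Probability assigned to the true class of this pixel.
        p_t = p if t == 1 else 1.0 - p
        # Clip to keep the logarithm finite.
        p_t = min(max(p_t, eps), 1.0 - eps)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def combined_loss(probs, targets, weight=0.5, gamma=2.0):
    """Linear combination: weight * focal + (1 - weight) * dice.

    `weight` is an assumed mixing coefficient in [0, 1].
    """
    return (weight * focal_loss(probs, targets, gamma)
            + (1.0 - weight) * dice_loss(probs, targets))
```

The focal term down-weights pixels that are already classified confidently, while the dice term measures overlap over the whole mask, which is one common motivation for combining the two on class-imbalanced segmentation tasks.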

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics ◽

10.3390/electronics10131550 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1550

Author(s):

Alexandros Liapis ◽

Evanthia Faliagka ◽

Christos P. Antonopoulos ◽

Georgios Keramidas ◽

Nikolaos Voros

Keyword(s):

Machine Learning ◽

Deep Learning ◽

User Experience ◽

Electrodermal Activity ◽

Binary Classification ◽

Research Question ◽

Classification Problem ◽

Categorical Variables ◽

Stress Detection ◽

Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets have been recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. For the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved with both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep learning model achieved a rather high agreement when a user-annotated dataset was used for validation.
