exprso: an R-package for the rapid implementation of machine learning algorithms

Machine learning plays a major role in many scientific investigations. However, non-expert programmers may struggle to implement the elaborate pipelines necessary to build highly accurate and generalizable models. We introduce exprso, a new R package that is an intuitive machine learning suite designed specifically for non-expert programmers. Built initially for the classification of high-dimensional data, exprso uses an object-oriented framework to encapsulate a number of common analytical methods into a series of interchangeable modules. This includes modules for feature selection, classification, high-throughput parameter grid-searching, elaborate cross-validation schemes (e.g., Monte Carlo and nested cross-validation), ensemble classification, and prediction. In addition, exprso supports multi-class classification (through the 1-vs-all generalization of binary classifiers) and the prediction of continuous outcomes.

exprso: an R-package for the rapid implementation of machine learning algorithms

F1000Research ◽

10.12688/f1000research.9893.1 ◽

2016 ◽

Vol 5 ◽

pp. 2588 ◽

Cited By ~ 4

Author(s):

Thomas Quinn ◽

Daniel Tylee ◽

Stephen Glatt

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Object Oriented ◽

R Package ◽

Machine Learning Algorithms ◽

Ensemble Classification ◽

Scientific Investigations ◽

Binary Classifiers ◽

Multi Class Classification

Machine learning plays a major role in many scientific investigations. However, non-expert programmers may struggle to implement the elaborate pipelines necessary to build highly accurate and generalizable models. We introduce here a new R package, exprso, as an intuitive machine learning suite designed specifically for non-expert programmers. Built primarily for the classification of high-dimensional data, exprso uses an object-oriented framework to encapsulate a number of common analytical methods into a series of interchangeable modules. This includes modules for feature selection, classification, high-throughput parameter grid-searching, elaborate cross-validation schemes (e.g., Monte Carlo and nested cross-validation), ensemble classification, and prediction. In addition, exprso provides native support for multi-class classification through the 1-vs-all generalization of binary classifiers. In contrast to other machine learning suites, we have prioritized simplicity of use over expansiveness when designing exprso.
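
The Monte Carlo cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. This is not exprso code (exprso is an R package, and its API is not shown here); it is a plain-Python illustration of the general technique, in which the data are repeatedly and randomly partitioned into training and test sets. The function name `monte_carlo_splits` and its parameters are hypothetical.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=5, test_fraction=0.25, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random partitions.

    Hypothetical helper for illustration only; not part of exprso.
    """
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition on every repeat
        yield sorted(indices[n_test:]), sorted(indices[:n_test])


# Three random 75/25 partitions of 20 samples.
splits = list(monte_carlo_splits(n_samples=20, n_repeats=3))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Unlike k-fold cross-validation, the test sets drawn in different repeats may overlap, so a sample can be tested more than once or not at all.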

Double-Step Machine Learning Based Procedure for HFOs Detection and Classification

Brain Sciences ◽

10.3390/brainsci10040220 ◽

2020 ◽

Vol 10 (4) ◽

pp. 220 ◽

Cited By ~ 1

Author(s):

Nicolina Sciaraffa ◽

Manousos A. Klados ◽

Gianluca Borghini ◽

Gianluca Di Flumeri ◽

Fabio Babiloni ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Double Step ◽

Optimal Length ◽

Signal Segmentation ◽

High Frequency Oscillations ◽

Specificity And Sensitivity ◽

Step Procedure ◽

Binary Classifiers

The need for automatic detection and classification of high-frequency oscillations (HFOs) as biomarkers of the epileptogenic tissue is strongly felt in the clinical field. In this context, the employment of artificial intelligence methods could be the missing piece to achieve this goal. This work proposed a double-step procedure based on machine learning algorithms and tested it on an intracranial electroencephalogram (iEEG) dataset available online. The first step aimed to define the optimal length for signal segmentation, allowing for an optimal discrimination of segments with HFOs relative to those without. In this case, binary classifiers were tested on a set of energy features. The second step aimed to classify these segments into ripples, fast ripples and fast ripples occurring during ripples. Results suggest that LDA applied to 10 ms segmentation could provide the highest sensitivity (0.874), with a specificity of 0.776, for the discrimination of HFO from no-HFO segments. Regarding the three-class classification, non-linear methods provided the highest values (around 90%) in terms of specificity and sensitivity, significantly different from those of the other three algorithms employed. Therefore, this machine-learning-based procedure could help clinicians to automatically reduce the quantity of irrelevant data.
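
The sensitivity and specificity figures reported above follow the standard definitions for a binary classifier: sensitivity is the fraction of true positives that are detected, and specificity is the fraction of true negatives that are correctly rejected. A minimal sketch of the calculation is given below; the labels are made-up illustrative data, not the study's results, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative labels only: 1 = segment with HFO, 0 = segment without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```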

Multi-Class Classification of Turkish Texts with Machine Learning Algorithms

2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) ◽

10.1109/ismsit.2018.8567307 ◽

2018 ◽

Cited By ~ 3

Author(s):

Fatih Gurcan

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Multi Class Classification

Estratégias para a Combinação de Classificadores Binários em Soluções Multiclasses

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.7016 ◽

2008 ◽

Vol 15 (2) ◽

pp. 65-86 ◽

Cited By ~ 1

Author(s):

Ana Carolina Lorena ◽

André C. P. L. F. De Carvalho

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Multiclass Problem ◽

Learning Techniques ◽

Binary Classifiers ◽

Binary Problems

Several problems involve the classification of data into categories, also called classes. Given a dataset containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques are originally conceived for the solution of problems with only two classes, also named binary problems. However, several problems require the discrimination of examples into more than two categories or classes. This paper surveys strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final classification.
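
One common decomposition of the kind described above is one-vs-all: a binary model is trained per class (that class against all others), and the class whose model gives the highest score wins. The sketch below illustrates this under stated assumptions. The binary learner is a toy centroid-based scorer chosen only to keep the example self-contained; it is not a method taken from the paper, and all function names are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_binary(X, is_positive):
    """Toy binary learner: store the centroids of positives and negatives."""
    pos = [x for x, flag in zip(X, is_positive) if flag]
    neg = [x for x, flag in zip(X, is_positive) if not flag]
    return centroid(pos), centroid(neg)


def score_binary(model, x):
    """Higher score means x looks more like the positive class."""
    pos_c, neg_c = model
    return squared_distance(x, neg_c) - squared_distance(x, pos_c)


def train_one_vs_all(X, y):
    """Train one binary model per class: that class versus all the rest."""
    return {c: train_binary(X, [label == c for label in y]) for c in sorted(set(y))}


def predict_one_vs_all(models, x):
    """Combine the binary outputs by picking the most confident model."""
    return max(models, key=lambda c: score_binary(models[c], x))


# Made-up data: three well-separated clusters.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0, 10), (1, 10), (0, 11)]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
models = train_one_vs_all(X, y)
predictions = [predict_one_vs_all(models, p) for p in [(0.5, 0.5), (10.5, 10.5), (0.5, 10.5)]]
print(predictions)
```

Other decompositions, such as training one binary model per pair of classes, differ only in how the binary subtasks are formed and how their outputs are combined.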

Extracted features based multi-class classification of orthodontic images

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i4.pp3558-3567 ◽

2020 ◽

Vol 10 (4) ◽

pp. 3558

Author(s):

Hicham Riri ◽

Mohammed Ed-Dhahraouy ◽

Abdelmajid Elmoutaouakkil ◽

Abderrahim Beni-Hssane ◽

Farid Bourzgui

Keyword(s):

Machine Learning ◽

Local Binary Pattern ◽

Principal Component ◽

Machine Learning Algorithms ◽

Support Vector ◽

Linear Discriminant ◽

Nearest Neighbours ◽

Multi Class Classification ◽

Pca Algorithm

The purpose of this study is to investigate computer vision and machine learning methods for the classification of orthodontic images, in order to provide orthodontists with a solution for multi-class classification of patients’ images to evaluate the evolution of their treatment. We proposed three algorithms based on extracted features, such as facial features and skin colour in the YCbCr colour space, assigned to the nodes of a decision tree to classify orthodontic images: an algorithm for intra-oral images, an algorithm for mould images and an algorithm for extra-oral images. We then compared our method against an implementation of the Local Binary Pattern (LBP) algorithm, which extracts textural features from images. After that, we applied the principal component analysis (PCA) algorithm to optimize the redundant parameters, and classified the LBP features with six classifiers: Quadratic Support Vector Machine (SVM), Cubic SVM, Radial Basis Function SVM, Cosine K-Nearest Neighbours (KNN), Euclidean KNN, and Linear Discriminant Analysis (LDA). The presented algorithms have been evaluated on a dataset of images of 98 different patients, and experimental results demonstrate the good performance of our proposed method, with a high accuracy compared with the machine learning algorithms, among which the LDA classifier achieves an accuracy of 84.5%.

Minimized False Alarm predictive Threshold for Cloud Service Providers

Recent Patents on Computer Science ◽

10.2174/2213275912666190819115320 ◽

2019 ◽

Vol 12 ◽

Author(s):

Amandeep Singh Arora ◽

Linesh Raja ◽

Barkha Bahl

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Service Providers ◽

Denial Of Service ◽

Cloud Service ◽

Machine Learning Algorithms ◽

Cloud Services ◽

Estimation Accuracy ◽

Denial Of Service Attacks

Cloud security is a strong hindrance which discourages organizations from moving toward the cloud despite its huge benefits. Denial of Service attacks [1] operated via distributed systems compromise the availability of cloud services. Techniques that identify distributed denial of service attacks with minimized false positives are highly required to ensure the availability of cloud services to genuine users. Classification of incoming requests and outgoing responses using machine learning algorithms is quite an effective way of detection and prevention. In this paper, ten machine learning algorithms have been evaluated for performance and detection accuracy. An accuracy estimation method known as F-Hold cross validation [2] is used for time-efficient analysis.

Performance Exploratory Comparative Study for Software Change and Effort Prediction in Object-Oriented Systems Using Statistical and Machine Learning Algorithms

SSRN Electronic Journal ◽

10.2139/ssrn.3166025 ◽

2018 ◽

Author(s):

Bhawana Mathur ◽

Manju Kaushik

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Learning Algorithms ◽

Object Oriented ◽

Machine Learning Algorithms ◽

Software Change ◽

Effort Prediction ◽

Object Oriented Systems

Prediction of K562 Cells Functional Inhibitors Based on Machine Learning Approaches

Current Pharmaceutical Design ◽

10.2174/1381612825666191107092214 ◽

2020 ◽

Vol 25 (40) ◽

pp. 4296-4302 ◽

Cited By ~ 2

Author(s):

Yuan Zhang ◽

Zhenyan Han ◽

Qian Gao ◽

Xiaoyi Bai ◽

Chi Zhang ◽

...

Keyword(s):

Machine Learning ◽

Inclusion Bodies ◽

Cross Validation ◽

Independent Set ◽

K562 Cells ◽

Machine Learning Algorithms ◽

Learning Approaches ◽

Validation Test ◽

Excess Number ◽

Fold Cross Validation

Background: β thalassemia is a common monogenic genetic disease that is very harmful to human health. The disease arises due to the deletion of or defects in β-globin, which reduces synthesis of the β-globin chain, resulting in a relative excess of α-chains. The formation of inclusion bodies deposited on the cell membrane causes a decrease in the ability of red blood cells to deform and a group of hereditary haemolytic diseases caused by massive destruction in the spleen. Methods: In this work, machine learning algorithms were employed to build a prediction model for inhibitors against K562 based on 117 inhibitors and 190 non-inhibitors. Results: The overall accuracies (ACC) of a 10-fold cross-validation test and an independent set test using Adaboost were 83.1% and 78.0%, respectively, surpassing Bayes Net, Random Forest, Random Tree, C4.5, SVM, KNN and Bagging. Conclusion: This study indicated that Adaboost could be applied to build a learning model in the prediction of inhibitors against K562 cells.

Classification of Brainwaves for Sleep Stages by High-Dimensional FFT Features from EEG Signals

Applied Sciences ◽

10.3390/app10051797 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1797 ◽

Cited By ~ 2

Author(s):

Mera Kartika Delimayanti ◽

Bedy Purnama ◽

Ngoc Giang Nguyen ◽

Mohammad Reza Faisal ◽

Kunti Robiatul Mahmudah ◽

...

Keyword(s):

Machine Learning ◽

Sleep Stage ◽

Machine Learning Algorithms ◽

High Dimensional ◽

Sleep Stages ◽

Eeg Signals ◽

Stage Classification ◽

Sleep Stage Classification ◽

Low Dimensional

Manual classification of sleep stages is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous works have applied low-dimensional fast Fourier transform (FFT) features together with many machine learning algorithms. In this paper, we demonstrate the use of features extracted from EEG signals via FFT to improve the performance of automated sleep stage classification through machine learning methods. Unlike previous works using FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features in combination with a simple feature selection are effective for the improvement of automated sleep stage classification.
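
As a rough illustration of what frequency-domain features of a signal segment look like, the sketch below computes one magnitude per frequency bin for a synthetic segment. It uses a naive discrete Fourier transform from the standard library rather than an optimized FFT routine; the two produce the same values, only more slowly. The segment length, sampling rate, and test signal are assumptions made for the example and are not taken from the paper.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return normalized magnitudes for frequency bins 0 .. n // 2."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


# Assumed toy segment: 1 second sampled at 64 Hz containing an 8 Hz sine wave.
sampling_rate = 64
segment = [math.sin(2 * math.pi * 8 * t / sampling_rate) for t in range(sampling_rate)]
features = dft_magnitudes(segment)  # one feature per frequency bin
peak_bin = max(range(len(features)), key=lambda k: features[k])
print(len(features), peak_bin)
```

With a 1-second segment each bin corresponds to 1 Hz, so the largest magnitude falls in bin 8. Longer segments or higher sampling rates yield more bins, which is how the number of such features can grow into the thousands.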

A Machine Learning Approach to Study Glycosidase Activities from Bifidobacterium

Microorganisms ◽

10.3390/microorganisms9051034 ◽

2021 ◽

Vol 9 (5) ◽

pp. 1034

Author(s):

Carlos Sabater ◽

Lorena Ruiz ◽

Abelardo Margolles

Keyword(s):

Machine Learning ◽

Supervised Classification ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Human Milk Oligosaccharides ◽

Future Studies ◽

High Fiber ◽

Machine Learning Approach ◽

Prebiotic Oligosaccharides

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of glycosidases codified in MAGs using machine-learning algorithms allowed establishing characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5 44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed discriminating bifidobacteria regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.
