exprso: an R-package for the rapid implementation of machine learning algorithms

F1000Research ◽  
2017 ◽  
Vol 5 ◽  
pp. 2588 ◽  
Author(s):  
Thomas Quinn ◽  
Daniel Tylee ◽  
Stephen Glatt

Machine learning plays a major role in many scientific investigations. However, non-expert programmers may struggle to implement the elaborate pipelines necessary to build highly accurate and generalizable models. We introduce exprso, a new R package that is an intuitive machine learning suite designed specifically for non-expert programmers. Built initially for the classification of high-dimensional data, exprso uses an object-oriented framework to encapsulate a number of common analytical methods into a series of interchangeable modules. These include modules for feature selection, classification, high-throughput parameter grid-searching, elaborate cross-validation schemes (e.g., Monte Carlo and nested cross-validation), ensemble classification, and prediction. In addition, exprso supports multi-class classification (through the 1-vs-all generalization of binary classifiers) and the prediction of continuous outcomes.

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2588 ◽  
Author(s):  
Thomas Quinn ◽  
Daniel Tylee ◽  
Stephen Glatt

Machine learning plays a major role in many scientific investigations. However, non-expert programmers may struggle to implement the elaborate pipelines necessary to build highly accurate and generalizable models. We introduce here a new R package, exprso, as an intuitive machine learning suite designed specifically for non-expert programmers. Built primarily for the classification of high-dimensional data, exprso uses an object-oriented framework to encapsulate a number of common analytical methods into a series of interchangeable modules. This includes modules for feature selection, classification, high-throughput parameter grid-searching, elaborate cross-validation schemes (e.g., Monte Carlo and nested cross-validation), ensemble classification, and prediction. In addition, exprso provides native support for multi-class classification through the 1-vs-all generalization of binary classifiers. In contrast to other machine learning suites, we have prioritized simplicity of use over expansiveness when designing exprso.
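To make the Monte Carlo cross-validation scheme named in the abstract concrete, here is a minimal sketch in plain Python. It is not the exprso API (exprso is an R package); the function name `monte_carlo_splits` and its parameters are hypothetical and chosen only for illustration. The idea is repeated random subsampling: each repeat draws a fresh random test set and trains on the remainder.

```python
import random

def monte_carlo_splits(n_samples, n_repeats=5, test_fraction=0.25, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random subsampling."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first n_test shuffled indices form the test set, the rest the training set.
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

splits = list(monte_carlo_splits(n_samples=20, n_repeats=3))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Unlike k-fold cross-validation, the test sets drawn in different repeats may overlap, so the number of repeats can be chosen independently of the test-set size.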


2020 ◽  
Vol 10 (4) ◽  
pp. 220 ◽  
Author(s):  
Nicolina Sciaraffa ◽  
Manousos A. Klados ◽  
Gianluca Borghini ◽  
Gianluca Di Flumeri ◽  
Fabio Babiloni ◽  
...  

The need for automatic detection and classification of high-frequency oscillations (HFOs) as biomarkers of the epileptogenic tissue is strongly felt in the clinical field. In this context, the employment of artificial intelligence methods could be the missing piece to achieve this goal. This work proposed a double-step procedure based on machine learning algorithms and tested it on an intracranial electroencephalogram (iEEG) dataset available online. The first step aimed to define the optimal length for signal segmentation, allowing for an optimal discrimination of segments with HFOs from those without. In this step, binary classifiers were tested on a set of energy features. The second step aimed to classify these segments into ripples, fast ripples and fast ripples occurring during ripples. Results suggest that LDA applied to 10 ms segmentation could provide the highest sensitivity (0.874), with a specificity of 0.776, for the discrimination of HFO from no-HFO segments. Regarding the three-class classification, non-linear methods provided the highest values (around 90%) in terms of specificity and sensitivity, significantly different from those of the other three employed algorithms. Therefore, this machine-learning-based procedure could help clinicians to automatically reduce the quantity of irrelevant data.
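The sensitivity and specificity figures quoted above follow from the counts of a binary confusion matrix. The sketch below shows the arithmetic in plain Python; the helper name and the toy labels are invented for illustration and are not taken from the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty classes rather than dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy labels, invented for illustration only (1 = HFO segment, 0 = no-HFO segment).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75
```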


2008 ◽  
Vol 15 (2) ◽  
pp. 65-86 ◽  
Author(s):  
Ana Carolina Lorena ◽  
André C. P. L. F. De Carvalho

Several problems involve the classification of data into categories, also called classes. Given a dataset containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques are originally conceived for the solution of problems with only two classes, also named binary problems. However, several problems require the discrimination of examples into more than two categories or classes. This paper surveys strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final classification.
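One common decomposition of this kind is one-vs-all: train one binary scorer per class (that class against the rest) and combine the outputs by taking the class with the highest score. The sketch below illustrates the idea in plain Python. The binary learner used here, a toy two-centroid scorer, is an assumption made to keep the example self-contained; it is not a method taken from the survey, and all function names are hypothetical.

```python
def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_binary(X, y_binary):
    """Fit a toy binary scorer: one centroid for positives, one for the rest."""
    pos = mean([x for x, t in zip(X, y_binary) if t])
    neg = mean([x for x, t in zip(X, y_binary) if not t])
    # A higher score means the point lies closer to the positive centroid.
    return lambda x: sq_dist(x, neg) - sq_dist(x, pos)

def train_one_vs_all(X, y):
    """Decompose the multiclass problem into one binary subtask per class."""
    return {c: train_binary(X, [t == c for t in y]) for c in sorted(set(y))}

def predict_one_vs_all(scorers, x):
    """Combine the binary outputs by picking the class with the highest score."""
    return max(scorers, key=lambda c: scorers[c](x))

# Toy data, invented for illustration: three well-separated classes in 2D.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [0.0, 5.0], [0.1, 5.2]]
y = ["a", "a", "b", "b", "c", "c"]
scorers = train_one_vs_all(X, y)
predictions = [predict_one_vs_all(scorers, x) for x in X]
print(predictions)
```

Any binary classifier that returns a real-valued confidence could replace `train_binary`; the decomposition and the argmax combination stay the same.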


Author(s):  
Hicham Riri ◽  
Mohammed Ed-Dhahraouy ◽  
Abdelmajid Elmoutaouakkil ◽  
Abderrahim Beni-Hssane ◽  
Farid Bourzgui

The purpose of this study is to investigate computer vision and machine learning methods for classification of orthodontic images in order to provide orthodontists with a solution for multi-class classification of patients’ images to evaluate the evolution of their treatment. To this end, we propose three algorithms based on extracted features, such as facial features and skin colour in the YCbCr colour space, assigned to nodes of a decision tree to classify orthodontic images: an algorithm for intra-oral images, an algorithm for mould images and an algorithm for extra-oral images. For comparison, we implemented the Local Binary Pattern (LBP) algorithm to extract textural features from images. We then applied principal component analysis (PCA) to reduce the redundant parameters and classified the LBP features with six classifiers: Quadratic Support Vector Machine (SVM), Cubic SVM, Radial Basis Function SVM, Cosine K-Nearest Neighbours (KNN), Euclidean KNN, and Linear Discriminant Analysis (LDA). The presented algorithms were evaluated on a dataset of images of 98 different patients, and experimental results demonstrate the good performance of our proposed method, with a high accuracy compared with the machine learning algorithms, among which the LDA classifier achieves an accuracy of 84.5%.


Author(s):  
Amandeep Singh Arora ◽  
Linesh Raja ◽  
Barkha Bahl

Cloud security is a strong hindrance that discourages organizations from moving to the cloud despite its huge benefits. Denial of Service attacks [1] operated via distributed systems compromise the availability of cloud services. Techniques that identify distributed denial of service attacks with minimal false positives are highly required to ensure the availability of cloud services to genuine users. Classification of incoming requests and outgoing responses using machine learning algorithms is quite an effective way of detection and prevention. In this paper, ten machine learning algorithms have been evaluated for performance and detection accuracy. An accuracy estimation method known as F-Hold cross validation [2] is used for time-efficient analysis.


2020 ◽  
Vol 25 (40) ◽  
pp. 4296-4302 ◽  
Author(s):  
Yuan Zhang ◽  
Zhenyan Han ◽  
Qian Gao ◽  
Xiaoyi Bai ◽  
Chi Zhang ◽  
...  

Background: β-thalassemia is a common monogenic genetic disease that is very harmful to human health. The disease arises from the deletion of or defects in β-globin, which reduces synthesis of the β-globin chain, resulting in a relative excess of α-chains. The formation of inclusion bodies deposited on the cell membrane causes a decrease in the ability of red blood cells to deform and a group of hereditary haemolytic diseases caused by massive destruction in the spleen. Methods: In this work, machine learning algorithms were employed to build a prediction model for inhibitors against K562 based on 117 inhibitors and 190 non-inhibitors. Results: The overall accuracy (ACC) of a 10-fold cross-validation test and an independent set test using Adaboost were 83.1% and 78.0%, respectively, surpassing Bayes Net, Random Forest, Random Tree, C4.5, SVM, KNN and Bagging. Conclusion: This study indicated that Adaboost could be applied to build a learning model in the prediction of inhibitors against K562 cells.


2020 ◽  
Vol 10 (5) ◽  
pp. 1797 ◽  
Author(s):  
Mera Kartika Delimayanti ◽  
Bedy Purnama ◽  
Ngoc Giang Nguyen ◽  
Mohammad Reza Faisal ◽  
Kunti Robiatul Mahmudah ◽  
...  

Manual classification of sleep stages is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous works have applied low-dimensional fast Fourier transform (FFT) features together with many machine learning algorithms. In this paper, we demonstrate the utilization of features extracted from EEG signals via FFT to improve the performance of automated sleep stage classification through machine learning methods. Unlike previous works using FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features in combination with simple feature selection are effective for the improvement of automated sleep stage classification.
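As a rough illustration of what frequency-domain features of a signal epoch look like, the sketch below computes the magnitudes of the non-negative-frequency bins with a naive discrete Fourier transform in plain Python. It is not the authors' pipeline: the function name is hypothetical, the epoch is a synthetic sinusoid rather than EEG, and a real implementation would use an optimized FFT routine instead of this O(n^2) loop.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Return magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff))
    return mags

# Synthetic epoch, invented for illustration: 4 full cycles over 32 samples.
n = 32
epoch = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
features = dft_magnitudes(epoch)

# The largest magnitude should sit in the bin matching the sinusoid's frequency.
peak_bin = max(range(len(features)), key=features.__getitem__)
print(len(features), peak_bin)
```

Each epoch yields one feature per frequency bin, so longer epochs or finer resolution quickly produce the high-dimensional feature vectors the abstract describes.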


2021 ◽  
Vol 9 (5) ◽  
pp. 1034 ◽  
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of glycosidases codified in MAGs using machine-learning algorithms allowed establishing characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed discriminating bifidobacteria regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.

