Using an Ensemble Learning Approach on Traditional Machine Learning Methods to Solve a Multi-Label Classification Problem

Author(s):  
Siddharth Basu ◽  
Sanjay Kumar ◽  
Sirjanpreet Singh Banga ◽  
Harshit Garg
2014 ◽  
Vol 5 (3) ◽  
pp. 82-96 ◽  
Author(s):  
Marijana Zekić-Sušac ◽  
Sanja Pfeifer ◽  
Nataša Šarlija

Abstract Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested on the same dataset in order to compare their classification accuracy: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to compute the sensitivity and specificity of each model. Results: The artificial neural network model based on a multilayer perceptron yielded a higher classification rate than the models produced by the other methods. The pairwise t-test showed a statistically significant difference between the artificial neural network and the k-nearest neighbour model, while the differences among the other methods were not statistically significant. Conclusions: The tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.
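The evaluation described in this abstract (per-model sensitivity and specificity, then a pairwise t-test on per-fold results) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0/1 label coding, and the example scores are assumptions made for illustration, and only the t statistic is computed (the p-value would come from a t distribution with n - 1 degrees of freedom).

```python
import math
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired per-fold scores of two models (df = n - 1)."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long score lists with n >= 2")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))
```

In a 10-fold setting, `scores_a` and `scores_b` would each hold ten per-fold accuracies obtained on the same folds, so that the differences are properly paired.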


2020 ◽  
Author(s):  
Dalin Yang ◽  
Keum-Shik Hong

Abstract Background: Mild cognitive impairment (MCI) is considered a prodromal stage of Alzheimer’s disease, which is the sixth leading cause of death in the United States. Early diagnosis of MCI can allow for treatment to improve cognitive function and reduce modifiable risk factors. Currently, the combination of machine learning and neuroimaging plays a role in identifying and understanding neuropathological diseases. However, some challenges still remain, and these limitations need to be addressed for clinical MCI diagnosis. Methods: In this study, for stable identification with functional near-infrared spectroscopy (fNIRS) using the minimum resting-state time, nine different measurement durations (i.e., 30, 60, 90, 120, 150, 180, 210, 240, and 270 s) were evaluated based on 30 s intervals using a traditional machine learning approach and graph theory analysis. The machine learning methods were trained using temporal features of the resting-state fNIRS signal and included linear discriminant analysis (LDA), support vector machine, and K-nearest neighbor (KNN) algorithms. To enhance the diagnostic accuracy, feature representation- and classification-based transfer learning (TL) methods were used to distinguish MCI from healthy controls through the input of connectivity maps with 30 and 90 s durations. Results: According to the results of the traditional machine learning and graph theory analysis, there was no significant difference among the different time windows. The accuracy of the conventional machine learning methods ranged from 55.76% (KNN, 120 s) to 67.00% (LDA, 90 s). The feature representation-based TL showed improved accuracy in both the 30 and 90 s cases (i.e., mean accuracy of 30 s: 79.37%, mean accuracy of 90 s: 74.05%). Notably, the classification-based TL method achieved the highest accuracy of 97.01% using the VGG19 pre-trained CNN model trained with the 30 s duration connectivity map.
Conclusion: The results indicate that a 30 s measurement of the resting state with fNIRS could be used to detect MCI. Moreover, the combination of neuroimaging (e.g., functional connectivity maps) and deep learning methods (e.g., CNN and TL) may be considered as novel biomarkers for clinical computer-assisted MCI diagnosis.


2007 ◽  
Vol 33 (3) ◽  
pp. 397-427 ◽  
Author(s):  
Raquel Fernández ◽  
Jonathan Ginzburg ◽  
Shalom Lappin

In this article we use well-known machine learning methods to tackle a novel task, namely the classification of non-sentential utterances (NSUs) in dialogue. We introduce a fine-grained taxonomy of NSU classes based on corpus work, and then report on the results of several machine learning experiments. First, we present a pilot study focused on one of the NSU classes in the taxonomy—bare wh-phrases or “sluices”—and explore the task of disambiguating between the different readings that sluices can convey. We then extend the approach to classify the full range of NSU classes, obtaining results of around an 87% weighted F-score. Thus our experiments show that, for the taxonomy adopted, the task of identifying the right NSU class can be successfully learned, and hence provide a very encouraging basis for the more general enterprise of fully processing NSUs.
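The weighted F-score reported above is usually understood as the per-class F1 averaged with weights proportional to each class's share of the true labels. The sketch below assumes that definition; the article's exact computation is not given here, and the function name and example labels are illustrative only.

```python
from collections import Counter


def weighted_f_score(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += (count / total) * f1  # weight by the class's share of the data
    return score
```

Weighting by support means frequent classes dominate the score, so a high weighted F-score can coexist with weak performance on rare classes.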


Author(s):  
I. Kaczmarek ◽  
A. Iwaniak ◽  
A. Świetlicka ◽  
M. Piwowarczyk ◽  
F. Harvey

Abstract. Spatial development plans provide important information on future land development capabilities. Unfortunately, at the moment access to planning information in Poland is limited. Despite many initiatives taken to standardize planning documents, a standard for recording plans has not yet been developed. Each of the planning areas has a symbol and a category of land use, which differ from plan to plan. For this reason, it is very difficult to carry out an analysis that aggregates all areas with the same development function. The authors conduct experiments aimed at using machine learning methods to process the text part of plans and classify it. The main aim was to find the best method for grouping texts of zones with the same land use. The experiment is an attempt to automatically classify the texts of findings for individual areas into 10 defined categories of land use. This makes it possible to predict the future land use function from the regulation text of a specific zone and to aggregate all zones with a specific land use type. In the proposed solution for the classification problem of heterogeneous planning information, the authors used the k-means algorithm and artificial neural networks. The main challenge for this solution, however, was not the design of the classification tool but rather the preprocessing of the text. This paper presents an approach for text preprocessing as well as selected methods of text classification. The results indicate that CNNs are the more useful of the two approaches for the problem presented. K-means clustering produces clusters in which texts are not grouped according to land use function, which is not useful for aggregating zones.


2020 ◽  
Author(s):  
Toni Lange ◽  
Guido Schwarzer ◽  
Thomas Datzmann ◽  
Harald Binder

Abstract Background: Updating systematic reviews is often a time-consuming process involving a lot of human effort and is therefore not carried out as often as it should be. Our aim was therefore to explore the potential of machine learning methods to reduce the human workload, and to particularly also gauge the performance of deep learning methods as compared to more established machine learning methods. Methods: We used three available reviews of diagnostic test studies as data basis. In order to identify relevant publications we used typical text pre-processing methods. The reference standard for the evaluation was the human-consensus based binary classification (inclusion, exclusion). For the evaluation of models various scenarios were generated using a grid of combinations of data preprocessing steps. Furthermore, we evaluated each machine learning approach with an approach-specific predefined grid of tuning parameters using the Brier score metric. Results: The best performance was obtained with an ensemble method for two of the reviews, and by a deep learning approach for the other review. Yet, the final performance of approaches is seen to strongly depend on data preparation. Overall, machine learning methods provided reasonable classification. Conclusion: It seems possible to reduce the human workload in updating systematic reviews by using machine learning methods. Yet, as the influence of data preprocessing on the final performance seems to be at least as important as choosing the specific machine learning approach, users should not blindly expect good performance just by using approaches from a popular class, such as deep learning.
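For binary inclusion/exclusion decisions, the Brier score mentioned above is the mean squared difference between the predicted inclusion probability and the 0/1 reference label, with lower values being better. A minimal sketch follows; the function name and the coding of inclusion as 1 are assumptions, not details taken from the study.

```python
def brier_score(y_true, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(probabilities) or not y_true:
        raise ValueError("inputs must be non-empty and equally long")
    squared_errors = [(p - y) ** 2 for y, p in zip(y_true, probabilities)]
    return sum(squared_errors) / len(squared_errors)
```

A model that always predicts 0.5 scores 0.25, which gives a simple baseline when comparing tuning-grid settings.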


Author(s):  
Andrius Daranda ◽  
Gintautas Dzemyda

Machine learning is compelling in solving various applied problems. Nevertheless, machine learning methods lack contextual reasoning capabilities and cannot readily use additional information about circumstances, environments, backgrounds, etc. Such information provides essential knowledge about possible reasons for particular actions, yet this knowledge cannot be processed directly by machine learning methods. This paper presents a context-aware machine learning approach for contextual reasoning about actor behavior and context-based prediction for threat assessment. Moreover, the proposed approach uses context-aware prediction to tackle the interaction between actors. The idea of the technique lies in the cooperative use of two classification methods: the first predicts an actor’s behavior, and the second flags predicted actions (behavior) that are non-typical or unusual. Integrating the two methods allows an actor to make a self-aware threat assessment based on relations between different actors, where multidimensional numerical data define the connections. This approach predicts the possible further situation and assesses its threat without waiting for future actions. The suggested approach is based on the Decision Tree and Support Vector Method algorithms. Due to the complexity of context, marine traffic data was chosen to demonstrate the capability of the proposed approach. The technique could support an end-to-end approach to safe vessel navigation in maritime traffic with considerable ship congestion.


2018 ◽  
Vol 3 (2) ◽  
pp. 444
Author(s):  
Prikazchikova A.S. ◽  
Prikazchikova G.S.

The article considers the binary classification of economic security objects, using credit institutions as an example, and proposes machine learning methods for the task. The study demonstrates that one such method, k-nearest neighbors, is suitable for this problem, achieving an efficiency of 84%. Key words: machine learning methods, financial statements, performance indicators, credit institutions, binary classification, k-nearest neighbors method.
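As a rough illustration of the k-nearest neighbors method named in this abstract, the sketch below classifies a query point by majority vote among its k closest training points under Euclidean distance. The feature vectors, the 0/1 labels, and the choice of k are hypothetical; the article's actual indicators and settings are not given here.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    if k < 1 or k > len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For binary problems an odd k avoids tied votes; in practice the features would also be scaled so that no single indicator dominates the distance.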

