Reducing Cognitive Overload by Meta-Learning Assisted Algorithm Selection

Author(s):  
Lisa Fan ◽  
Minxiao Lei

With the explosion of available data mining algorithms, methods that help users select the most appropriate algorithm or combination of algorithms for a given problem, and that reduce the cognitive overload caused by the sheer number of available algorithms, are becoming increasingly important. This chapter presents a meta-learning approach that supports users by automatically selecting the most suitable algorithms during the data mining model-building process. The authors discuss the meta-learning method in detail and present empirical results showing the improvement achieved by a hybrid model that combines meta-learning with Rough Set feature reduction. Rough Set reduction identifies the redundant properties of the dataset, so using the reduct of those properties speeds up the ranking process and increases accuracy. With the reduced search space, users’ cognitive load is reduced.
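The ranking idea described above can be illustrated with a minimal sketch. It assumes a nearest-neighbour style of meta-learning, which the abstract does not confirm, and it takes the reduct (the retained meta-features) as given rather than computing it with Rough Set theory. The meta-knowledge base, feature names, algorithm names, and all numbers are hypothetical.

```python
import math

# Hypothetical meta-knowledge base: dataset -> (meta-features, accuracy per algorithm).
# Every name and number here is illustrative, not taken from the chapter.
META_BASE = {
    "d1": ({"n_instances": 0.1, "n_classes": 0.2, "class_entropy": 0.9},
           {"tree": 0.90, "nb": 0.70, "knn": 0.80}),
    "d2": ({"n_instances": 0.2, "n_classes": 0.1, "class_entropy": 0.1},
           {"tree": 0.85, "nb": 0.75, "knn": 0.80}),
    "d3": ({"n_instances": 0.9, "n_classes": 0.8, "class_entropy": 0.5},
           {"tree": 0.60, "nb": 0.88, "knn": 0.70}),
    "d4": ({"n_instances": 0.8, "n_classes": 0.9, "class_entropy": 0.4},
           {"tree": 0.65, "nb": 0.90, "knn": 0.75}),
}


def distance(a, b, features):
    """Euclidean distance restricted to the chosen meta-features."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))


def rank_algorithms(new_meta, features, k=2):
    """Rank algorithms by their mean accuracy on the k most similar datasets."""
    neighbours = sorted(
        META_BASE.values(),
        key=lambda entry: distance(new_meta, entry[0], features),
    )[:k]
    algorithms = neighbours[0][1].keys()
    mean_acc = {
        alg: sum(perf[alg] for _, perf in neighbours) / len(neighbours)
        for alg in algorithms
    }
    return sorted(mean_acc, key=mean_acc.get, reverse=True)


# Assumed reduct: only two of the three meta-features are kept.
REDUCT = ["n_instances", "n_classes"]
new_dataset = {"n_instances": 0.85, "n_classes": 0.85, "class_entropy": 0.45}
ranking = rank_algorithms(new_dataset, REDUCT)
```

Restricting the distance to the reduct means fewer meta-features are compared per dataset, which is one way the reduced search space could speed up ranking.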

2017 ◽  
Vol 27 (4) ◽  
pp. 697-712 ◽  
Author(s):  
Besim Bilalli ◽  
Alberto Abelló ◽  
Tomàs Aluja-Banet

Abstract: The demand for performing data analysis is steadily rising. As a consequence, people of different profiles (i.e., non-experienced users) have started to analyze their data. However, this is challenging for them. A key step that poses difficulties and determines the success of the analysis is data mining (the model/algorithm selection problem). Meta-learning is a technique used for assisting non-expert users in this step. The effectiveness of meta-learning is, however, largely dependent on the description/characterization of datasets (i.e., the meta-features used for meta-learning). There is a need to improve the effectiveness of meta-learning by identifying and designing more predictive meta-features. In this work, we use a method from exploratory factor analysis to study the predictive power of different meta-features collected in OpenML, a collaborative machine learning platform designed to store and organize meta-data about datasets, data mining algorithms, models, and their evaluations. We first use the method to extract latent features, which are abstract concepts that group together meta-features with common characteristics. Then, we study and visualize the relationship of the latent features with three different performance measures of four classification algorithms on hundreds of datasets available in OpenML, and we select the latent features with the highest predictive power. Finally, we use the selected latent features to perform meta-learning, and we show that our method improves the meta-learning process. Furthermore, we design an easy-to-use application for retrieving different meta-data from OpenML, the biggest source of data in this domain.


2019 ◽  
Vol 3 (2) ◽  
pp. 33
Author(s):  
Raheleh Hamedanizad ◽  
Elham Bahmani ◽  
Mojtaba Jamshidi ◽  
Aso Mohammad Darwesh

Addiction to narcotics is one of the greatest health challenges in today’s world. It has become a serious threat to social, economic, and cultural structures, has ruined part of society’s active workforce, and is one of the main factors in the spread of diseases such as HIV and hepatitis. Today, addiction is recognized as a disease, and the welfare organization and many of its affiliated centers try to help addicts treat it. In this study, using data mining algorithms and data collected from opioid withdrawal applicants referred to a welfare organization, a model is proposed to predict the success of opioid withdrawal applicants. The statistical population comprises opioid withdrawal applicants in a welfare organization and includes 26 features of 793 instances, covering both men and women. The proposed model is a combination of meta-learning algorithms (Decorate and Bagging) and the J48 decision tree, implemented in the Weka data mining software. The efficiency of the proposed model is evaluated in terms of precision, recall, Kappa, and root mean squared error, and the results are compared with algorithms such as the multilayer perceptron neural network, Naive Bayes, and Random Forest. The results of various experiments show that the precision of the proposed model is 71.3%, which is superior to that of the other compared algorithms.
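Three of the evaluation measures named above can be sketched for the binary case as follows. This is not the study's Weka pipeline; the function name and the confusion-matrix counts are hypothetical, and root mean squared error is left out because it depends on predicted class probabilities that the abstract does not report.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, and Cohen's kappa from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Observed agreement between predictions and true labels.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return {"precision": precision, "recall": recall, "kappa": kappa}


# Hypothetical counts, for illustration only (not the study's results).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
```

Kappa corrects raw agreement for chance, which makes it a useful complement to precision when the two outcome classes are unbalanced.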


2007 ◽  
Vol 46 (01) ◽  
pp. 05-18 ◽  
Author(s):  
Y. Lee ◽  
K. Dharmala ◽  
C.H. Lee

Summary. Objectives: A number of controversial studies have been reported on the potential risk of breast cancer caused by hormone replacement therapy (HRT). Some studies showed a positive relationship between HRT and breast cancer onset, but other studies have not confirmed these results. To clarify the contradictory outcomes in the relationships between HRT and the onset of breast cancer, we have designed an intelligent data mining model (IDM), which is able to find proper prognostic factors for cancer onset and provides alternate measures for interpreting the outcome of clinical data through hierarchies of attributes. Methods: Based on the selection criteria, we selected 22 sets of random and case-control studies from the last 15 years that identified any involvement of HRT in breast cancer. We analyzed the relationship between HRT and breast cancer using an IDM model consisting of data mining algorithms and public domain data mining tools. Prognostic factors that underlie the major etiological dispositions of breast cancer were identified. Results: The variables that are closely associated with cancer onset to some degree are age 60-69, age at menopause 40-49, parity 0, age 40-49, and type of menopause oophorectomy. An implementation of the IDM model on the overall pooled data indicated that there is no significant relationship between breast cancer onset and HRT. It is suggested that HRT patients with specific physiological and pathological conditions related to the higher-ranked prognostic factors may have a greater chance of developing breast cancer. Conclusion: The results of this study may guide biomedical research directed at establishing the causal relationships between various medications and their complications, allowing an accurate assessment of the efficacy and side effects of new therapeutic treatments in clinical trials without reliance on a large control population.


2011 ◽  
Vol 179-180 ◽  
pp. 646-650
Author(s):  
Xiao Hong Han ◽  
Lei Wang ◽  
Pei Jun Zhang

This paper highlights the data mining components of SQL Server 2005 and the process of building a data mining solution. It completes the creation, training, and corresponding predictions of a data mining model and implements data mining operations using data mining algorithms, so that the application program, relational database, and data mining are seamlessly integrated. SQL Server 2005 provides a powerful design and development platform for data mining solutions that does not require deep familiarity with data mining techniques and algorithms.


2019 ◽  
Vol 14 (1) ◽  
pp. 21-26 ◽  
Author(s):  
Viswam Subeesh ◽  
Eswaran Maheswari ◽  
Hemendra Singh ◽  
Thomas Elsa Beulah ◽  
Ann Mary Swaroop

Background: A signal is defined as “reported information on a possible causal relationship between an adverse event and a drug, of which the relationship is unknown or incompletely documented previously”. Objective: To detect novel adverse events of iloperidone by disproportionality analysis in the FDA Adverse Event Reporting System (FAERS) database using Data Mining Algorithms (DMAs). Methodology: The US FAERS database contains 1028 iloperidone-associated Drug Event Combinations (DECs) reported from 2010 Q1 to 2016 Q3. We considered DECs for disproportionality analysis only if at least ten reports were present in the database for the given adverse event and the event had not been detected earlier (in clinical trials). Two data mining algorithms, namely Reporting Odds Ratio (ROR) and Information Component (IC), were applied retrospectively over the aforementioned time period. Values of ROR-1.96SE > 1 and IC-2SD > 0 were considered the threshold for a positive signal. Results: The mean age of the patients with iloperidone-associated events was found to be 44 years [95% CI: 36-51], although age was not mentioned in twenty-one reports. The data mining algorithms exhibited a positive signal for akathisia (ROR-1.96SE = 43.15, IC-2SD = 2.99), dyskinesia (21.24, 3.06), peripheral oedema (6.67, 1.08), priapism (425.7, 9.09), and sexual dysfunction (26.6, 1.5), as these were well above the pre-set threshold. Conclusion: Five potential signals associated with iloperidone were generated by data mining in the FDA AERS database. The result requires integration of further clinical surveillance to quantify and validate the possible risks for the reported adverse events of iloperidone.
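The disproportionality measures above can be sketched from a 2x2 contingency table of report counts. This sketch assumes the conventional reading of "ROR-1.96SE" as the lower 95% confidence limit of the ROR, computed on the log scale, and it gives only an unshrunk observed-to-expected point estimate for the IC. The SD behind the IC-2SD bound is not computed here, since the abstract does not describe how it was obtained. Function names and counts are hypothetical.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """ROR and its lower 95% confidence limit.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    return ror, lower


def information_component(a, b, c, d):
    """Unshrunk IC point estimate: log2 of observed over expected count."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n
    return math.log2(a / expected)


def is_ror_signal(a, b, c, d, min_reports=10):
    """Apply the minimum-report rule and the ROR lower-limit threshold."""
    _, lower = reporting_odds_ratio(a, b, c, d)
    return a >= min_reports and lower > 1


# Hypothetical counts, for illustration only (not FAERS data).
ror, ror_lower = reporting_odds_ratio(20, 980, 100, 98900)
ic = information_component(20, 980, 100, 98900)
signal = is_ror_signal(20, 980, 100, 98900)
```

Using the lower confidence limit rather than the point estimate keeps combinations with few reports from being flagged on a large but imprecise ratio.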


Author(s):  
Ari Fadli ◽  
Azis Wisnu Widhi Nugraha ◽  
Muhammad Syaiful Aliim ◽  
Acep Taryana ◽  
Yogiek Indra Kurniawan ◽  
...  
