Hybridization of PSO-ABC Based Ensemble Classification Model for High Dimensional Medical Datasets

Author(s): G. Lalitha Kumari, N. Naga Malleswara Rao

Complexity, 2021, Vol 2021, pp. 1-12
Author(s): Jing Zhang, Guang Lu, Jiaquan Li, Chuanwen Li

Mining useful knowledge from high-dimensional data is a hot research topic. Efficient and effective sample classification and feature selection are challenging tasks because of the high dimensionality and small sample size of microarray data. Feature selection is necessary when constructing the model in order to reduce time and space consumption. Therefore, a feature selection model based on prior knowledge and rough sets is proposed. Pathway knowledge is used to select feature subsets, and a rough set based on intersection neighborhood is then used to select the important features in each subset, since it can select features without redundancy and deals with numerical features directly. To improve the diversity among base classifiers and the efficiency of classification, only a subset of the base classifiers is selected. Classifiers are grouped into several clusters by k-means clustering using the proposed combination distance of Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to generate the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
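
As a rough illustration of the classifier-selection step described above, the sketch below computes Cohen's kappa between the predictions of two base classifiers and folds it into a single pairwise distance. The abstract does not give the actual combination formula, so `combined_distance`, its `weight` parameter, and the toy prediction vectors are assumptions made only for this example.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences contain the same single label
    return (observed - expected) / (1.0 - expected)


def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def combined_distance(pred_i, pred_j, truth, weight=0.5):
    """Hypothetical pairwise distance between two classifiers.

    ASSUMPTION: a weighted sum of kappa-based disagreement and the
    accuracy gap. The paper's real formula is not stated in the abstract.
    """
    disagreement = 1.0 - cohen_kappa(pred_i, pred_j)
    acc_gap = abs(accuracy(pred_i, truth) - accuracy(pred_j, truth))
    return weight * disagreement + (1.0 - weight) * acc_gap


# Toy data (made up for illustration only).
truth = [0, 0, 1, 1, 0, 1, 1, 0]
clf_a = [0, 0, 1, 1, 0, 1, 0, 0]
clf_b = [0, 1, 1, 1, 0, 0, 1, 0]

kappa_ab = cohen_kappa(clf_a, clf_b)
dist_ab = combined_distance(clf_a, clf_b, truth)
```

A matrix of such pairwise distances could then be handed to a clustering routine, with the most accurate classifier kept from each cluster.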


2020, Vol 43 (1), pp. 103-125
Author(s): Yi Zhong, Jianghua He, Prabhakar Chalise

With the advent of high-throughput technologies, high-dimensional datasets are increasingly available. This has not only opened up new insights into biological systems but also posed analytical challenges. One important problem is the selection of an informative feature subset and the prediction of future outcomes. It is crucial that models are not overfitted and give accurate results with new data. In addition, reliable identification of informative features with high predictive power (feature selection) is of interest in clinical settings. We propose a two-step framework for feature selection and classification model construction, which utilizes a nested and repeated cross-validation method. We evaluated our approach using both simulated data and two publicly available gene expression datasets. The proposed method showed comparatively better predictive accuracy for new cases than the standard cross-validation method.
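
The general idea of nested and repeated cross-validation can be sketched as an index generator: feature selection and tuning see only the inner splits, while each outer test fold is held back for the final accuracy estimate. This is a generic sketch and not the authors' two-step framework; the function names, fold counts, and seed below are assumptions chosen for illustration.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_repeated_cv(n_samples, outer_k=5, inner_k=3, repeats=2, seed=0):
    """Yield (repeat, outer_train, outer_test, inner_splits) tuples.

    inner_splits is a list of (inner_train, inner_valid) pairs drawn only
    from outer_train, so the outer test fold never influences selection.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        # A fresh shuffle per repeat gives a different outer partition.
        outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
        for i, outer_test in enumerate(outer_folds):
            outer_train = [idx for j, fold in enumerate(outer_folds)
                           if j != i for idx in fold]
            inner_folds = k_fold_indices(outer_train, inner_k, rng)
            inner_splits = []
            for m, inner_valid in enumerate(inner_folds):
                inner_train = [idx for q, fold in enumerate(inner_folds)
                               if q != m for idx in fold]
                inner_splits.append((inner_train, inner_valid))
            yield rep, outer_train, outer_test, inner_splits


# Example: 20 samples, 2 repeats of 5 outer folds, 3 inner folds each.
splits = list(nested_repeated_cv(n_samples=20))
```

In use, a model would be fitted and its features selected on each `inner_train`, compared on `inner_valid`, refitted on `outer_train`, and scored once on `outer_test`.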


2019, Vol 15 (3), pp. 497-514
Author(s): Jozef Michal Mintal, Róbert Vancel

Social networking services (SNSs) can significantly impact public life during important political events. Thus, it comes as no surprise that different political actors try to exploit these online platforms for their benefit. Bots constitute a popular tool on SNSs that appears to be able to shape public opinion and disrupt political processes. However, the role of bots during political events in a non-Western context remains largely under-studied. This article addresses the question of the involvement of Twitter bots during electoral campaigns in Japan. In our study, we collected Twitter data over a fourteen-day period in October 2017 using a set of hashtags related to the 2017 Japanese general election. Our dataset includes 905,215 tweets, 665,400 of which were unique tweets. Using a supervised machine learning approach, we first built a custom ensemble classification model for bot detection based on user profile features, with an area under curve (AUC) for the test set of 0.998. Second, in applying our model, we estimate that the impact of Twitter bots in Japan was minor overall. In comparison with similar studies conducted during elections in the US and the UK, the deployment of Twitter bots involved in the 2017 Japanese general election seems to be significantly lower. Finally, given our results on the level of bots on Twitter during the 2017 Japanese general election, we provide various possible explanations for their underuse within a broader socio-political context.

