A Decision Cluster Classification Model with Model Selection Strategy

Author(s):  
Zhaocai Sun ◽  
Yunming Ye ◽  
Zhexue Huang ◽  
Wu Chen ◽  
Chunshan Li
2012 ◽  
Vol 12 (8) ◽  
pp. 2550-2565 ◽  
Author(s):  
Marcelo N. Kapp ◽  
Robert Sabourin ◽  
Patrick Maupin

Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7752
Author(s):  
Jose M. Celaya-Padilla ◽  
Jonathan S. Romero-González ◽  
Carlos E. Galvan-Tejada ◽  
Jorge I. Galvan-Tejada ◽  
Huizilopoztli Luna-García ◽  
...  

Worldwide, motor vehicle accidents are one of the leading causes of death, with alcohol-related accidents playing a significant role, particularly in child deaths. To aid in the prevention of this type of accident, a novel non-invasive method capable of detecting the presence of alcohol inside a motor vehicle is presented. The proposed methodology uses a series of low-cost MQ3 alcohol sensors located inside the vehicle, whose signals are stored, standardized, time-adjusted, and transformed into 5 s window samples. Statistical features are extracted from each sample, and a feature selection strategy is carried out using a genetic algorithm together with forward selection and backward elimination. The four features derived from this process were used to construct an SVM classification model that detects the presence of alcohol. The experiments yielded 7200 samples, 80% of which were used to train the model. The rest were used to evaluate the performance of the model, which obtained an area under the ROC curve of 0.98 and a sensitivity of 0.979. These results suggest that the proposed methodology can be used to detect the presence of alcohol and enforce prevention actions.
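The windowing and feature extraction step described above can be illustrated with a minimal sketch. The abstract does not say which statistical features were computed or what the sensor sampling rate was, so the feature set (mean, standard deviation, minimum, maximum), the `sample_rate_hz` parameter, and the function name `window_features` are assumptions made for illustration only; the SVM and feature selection stages are not shown.

```python
import statistics


def window_features(signal, sample_rate_hz, window_s=5.0):
    """Split a 1-D sensor signal into non-overlapping windows and
    compute simple statistical features for each full window.

    The feature set here is hypothetical; the paper's abstract does
    not list the features it actually uses.
    """
    size = int(sample_rate_hz * window_s)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    features = []
    # Trailing samples that do not fill a whole window are dropped.
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append({
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),  # population std of the window
            "min": min(window),
            "max": max(window),
        })
    return features
```

Each returned dictionary corresponds to one 5 s sample; in a pipeline like the one the abstract outlines, these rows would then be passed to feature selection and a classifier.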


2019 ◽  
Vol 67 (2) ◽  
pp. 111-116
Author(s):  
Fabiha Binte Farooq ◽  
Md Jamil Hasan Karami

Often in survival regression modelling, not all predictors are relevant to the outcome variable. Discarding such irrelevant variables is crucial in model selection. In this research, under the Cox Proportional Hazards (PH) model, we study different model selection criteria including stepwise selection, the Least Absolute Shrinkage and Selection Operator (LASSO), the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the extended versions of AIC and BIC for the Cox model. The simulation study shows that varying censoring proportions and correlation coefficients among the covariates have a great impact on the performance of the criteria in identifying the true model. In the presence of high correlation among the covariates, the success rate for identifying the true model is higher for LASSO than for the other criteria. The extended version of BIC always shows better results than the traditional BIC. We have also applied these techniques to real-world data. Dhaka Univ. J. Sci. 67(2): 111-116, 2019 (July)
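For reference, the traditional AIC and BIC mentioned above can be computed from a fitted model's maximized log-likelihood (for a Cox model, the partial log-likelihood). The sketch below uses the standard textbook definitions only; it does not reproduce the extended AIC and BIC variants the abstract refers to, and the helper names and the example numbers are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + k log n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_model(fits, n_obs, criterion="bic"):
    """Return the name of the candidate with the lowest criterion value.

    `fits` maps a model name to a (log_likelihood, n_params) pair.
    """
    def score(item):
        log_likelihood, n_params = item[1]
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_obs)

    return min(fits.items(), key=score)[0]


# Hypothetical fits: the larger model has the higher log-likelihood
# but also more parameters.
fits = {"small": (-105.0, 2), "large": (-100.0, 4)}
chosen_by_aic = select_model(fits, n_obs=1000, criterion="aic")
chosen_by_bic = select_model(fits, n_obs=1000, criterion="bic")
```

With these made-up numbers the two criteria disagree: BIC's penalty grows with the sample size, so it favors the smaller model where AIC favors the larger one.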


2019 ◽  
Vol 8 (2) ◽  
pp. 5969-5971

Feature selection is the most important step in developing any modern learning model. As learning models grow more complex, there is increasing demand for selecting the right features to build the model. Many feature selection methods exist. Here, a new feature selection method based on the MANOVA statistical test is implemented. Using the MANOVA test, we select attributes from academic datasets and build a classification model from the selected attributes. The accuracy of the model with feature selection is compared with that of a model using all attributes, and the results are discussed. The results show that the classification model built with features selected by the MANOVA test achieves higher accuracy than a model built with all features.


2020 ◽  
Vol 20 (S14) ◽  
Author(s):  
Ming Liang ◽  
ZhiXing Zhang ◽  
JiaYing Zhang ◽  
Tong Ruan ◽  
Qi Ye ◽  
...  

Background: Laboratory indicator test results in electronic health records have been applied to many clinical big data analyses. However, it is quite common that the same laboratory examination item (i.e., lab indicator) is presented using different names in Chinese due to translation problems and the naming habits of various hospitals, which distorts analysis results. Methods: A framework with a recall model and a binary classification model is proposed, which could reduce the alignment scale and improve the accuracy of lab indicator normalization. To reduce the alignment scale, tf-idf is used for candidate selection. To ensure the accuracy of the output, we use an enhanced sequential inference model for binary classification. Active learning is applied with a proposed selection strategy to reduce annotation cost. Results: Since our indicator standardization method mainly focuses on Chinese indicator inconsistency, we perform our experiment on data from the Shanghai Hospital Development Center (SHDC) and select clinical data from 8 hospitals. The method achieves an F1-score of 92.08% in our final binary classification. As for active learning, the proposed strategy performs better than the random baseline and could outperform the result trained on the full data with only 43% of the training data. A case study on heart failure clinic analysis conducted on the sub-dataset collected from SHDC shows that our proposed method is practical in application, with good performance. Conclusion: This work demonstrates that the structure we proposed can be effectively applied to lab indicator normalization. Active learning is also suitable for this task for cost reduction. Such a method is also valuable in data cleaning, data mining, text extraction, and entity alignment.
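The tf-idf candidate selection step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe tokenization, so character bigrams are assumed; the smoothed idf formula is one common choice; the function names and the English indicator names are hypothetical stand-ins (the paper works with Chinese names). The second-stage binary classifier is not shown.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Lowercased character n-grams; short strings fall back to the whole text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)] or [text]


def tfidf_vectors(docs, n=2):
    """Build sparse tf-idf vectors (dicts) for a list of strings."""
    tokenized = [char_ngrams(d, n) for d in docs]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    total = len(docs)
    # Smoothed idf keeps weights positive even for n-grams in every document.
    idf = {t: math.log((1 + total) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_candidates(query, standard_names, top_k=3, n=2):
    """Return the top_k standard names most similar to the query string."""
    vectors, idf = tfidf_vectors(standard_names, n)
    query_tf = Counter(char_ngrams(query, n))
    # n-grams unseen in the standard names carry no weight.
    query_vec = {t: c * idf[t] for t, c in query_tf.items() if t in idf}
    scored = [(cosine(query_vec, v), name) for v, name in zip(vectors, standard_names)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical standard names and a variant spelling to normalize.
names = ["hemoglobin", "white blood cell count", "platelet count"]
candidates = recall_candidates("haemoglobin", names, top_k=1)
```

Restricting the expensive pairwise classifier to a short candidate list like this is what reduces the alignment scale: each raw name is compared against `top_k` standard names rather than all of them.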


2021 ◽  
Vol 12 (3) ◽  
pp. 215-232
Author(s):  
Heng Xiao ◽  
Toshiharu Hatanaka

Swarm intelligence is inspired by natural group behavior and is one of the promising metaheuristics for black-box function optimization. Many swarm intelligence algorithms, such as particle swarm optimization (PSO) and the firefly algorithm (FA), have been developed. Since these swarm intelligence models share some common properties and inherent characteristics, model hybridization is expected to adapt a swarm intelligence model to the target problem without the trial-and-error parameter tuning otherwise required. This paper proposes a PSO-FA hybrid algorithm with a model selection strategy. An event-driven trigger based on the personal best update makes each individual perform model selection, focusing on the personal study process. The proposed hybrid algorithm is tested on several benchmark problems and compared with a simple PSO, the standard PSO 2011, FA, and HFPSO to show that the proposed hybrid swarm performs well on average in black-box optimization problems.


2017 ◽  
Vol 2017 ◽  
pp. 1-15 ◽  
Author(s):  
Zhengping Liang ◽  
Rui Guo ◽  
Jiangtao Sun ◽  
Zhong Ming ◽  
Zexuan Zhu

Ant colony optimization (ACO) algorithms have been successfully applied to identify classification rules in data mining. This paper proposes a new ant colony optimization algorithm, named hmAntMinerorder, for the hierarchical multilabel classification problem in protein function prediction. The proposed algorithm is characterized by an orderly roulette selection strategy that distinguishes the merits of the data attributes through attribute importance ranking in classification model construction. A new pheromone update strategy is introduced to prevent the algorithm from getting trapped in local optima, thus leading to more efficient identification of classification rules. Comparison studies with other closely related algorithms on 16 publicly available datasets reveal the efficiency of the proposed algorithm.
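As background for the selection step, plain roulette-wheel (fitness-proportionate) selection can be sketched as below. This is the generic technique only; the paper's "orderly" variant with attribute importance ranking is not described in the abstract and is not reproduced here. The function name and the example weights are hypothetical; in an ACO setting the weights would typically combine pheromone and heuristic values.

```python
import random


def roulette_select(weights, rng=random):
    """Pick an index with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    threshold = rng.random() * total
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


# Hypothetical attribute weights: index 0 can never be chosen,
# index 2 is three times as likely as index 1.
rng = random.Random(0)
picks = [roulette_select([0.0, 1.0, 3.0], rng) for _ in range(1000)]
```

Passing a seeded `random.Random` instance makes runs reproducible, which helps when comparing selection strategies.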

