OPTIMIZATION OF EVALUATION OF THE INFORMATIVITY OF MEDICAL INDICATORS ON THE BASIS OF THE HYBRID APPROACH

2017 ◽  
pp. 108-115
Author(s):  
Є.В. БОДЯНСЬКИЙ ◽  
І.Г. ПЕРОВА ◽  
Г.В. СТОЙКА

Feature selection is one of the most complicated and topical tasks in data mining. Many approaches to solving it are based on non-mathematical and presentative hypotheses. A new approach to evaluating the information content of medical features, based on an optimal combination of feature selection and feature extraction methods, is proposed. This approach yields an optimal reduced number of features while keeping a linguistic interpretation of each one. A hybrid feature selection/extraction system is proposed. The system is numerically simple and can perform feature selection/extraction with any number of features, using the standard method of principal component analysis and calculating the distance between the first principal component and all medical features.
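The abstract does not spell out how the distance between the first principal component and the features is computed, so the following is only a minimal sketch of one plausible reading: standardize the features, find the first principal direction by power iteration, and rank each feature by the root-mean-square distance between its column and the (unit-variance) first-component scores. The function names, the sample data, and the choice of distance are assumptions made for illustration, not the authors' published procedure.

```python
import math

def standardize(rows):
    """Center each column to zero mean and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def first_principal_direction(z, iters=200):
    """Power iteration on the covariance matrix of standardized data.

    Assumes the uniform start vector is not orthogonal to the leading
    eigenvector, which holds for typical data.
    """
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        v = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w

def rank_features_by_distance(rows):
    """Return (ranking, distances); a smaller distance means the feature
    lies closer to the first principal component (assumed more informative)."""
    z = standardize(rows)
    w = first_principal_direction(z)
    n, d = len(z), len(z[0])
    scores = [sum(w[j] * r[j] for j in range(d)) for r in z]
    scale = math.sqrt(sum(s * s for s in scores) / n) or 1.0
    scores = [s / scale for s in scores]
    distances = []
    for j in range(d):
        # The sign of a principal component is arbitrary, so try both.
        plus = math.sqrt(sum((r[j] - s) ** 2 for r, s in zip(z, scores)) / n)
        minus = math.sqrt(sum((r[j] + s) ** 2 for r, s in zip(z, scores)) / n)
        distances.append(min(plus, minus))
    ranking = sorted(range(d), key=lambda j: distances[j])
    return ranking, distances

# Hypothetical data: features 0 and 1 move together, feature 2 does not.
patients = [[1, 2.1, 5], [2, 3.9, 1], [3, 6.2, 4], [4, 7.8, 2], [5, 10.1, 3]]
ranking, distances = rank_features_by_distance(patients)
```

Because the selected features are original columns rather than linear mixtures, each one keeps its clinical name, which is the "linguistic interpretation" the abstract refers to.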

2018 ◽  
pp. 113-119
Author(s):  
Iryna Perova ◽  
Yevgeniy Bodyanskiy

Feature selection is one of the most complicated and topical tasks in data mining and human-machine interaction. Many approaches to solving it are based on non-mathematical and presentative hypotheses. A new approach to evaluating the information content of medical features, based on an optimized combination of feature selection and feature extraction methods, is proposed. This approach yields an optimal reduced number of features while keeping a linguistic interpretation of each of them. A hybrid feature selection/extraction system based on neural network-physician interaction is investigated. The system is numerically simple and can perform feature selection/extraction with any number of factors in online mode, using neural network-physician interaction based on Oja's neurons for online principal component analysis and calculating the distance between the first principal component and all input features. A series of experiments confirms the efficiency of the proposed approaches in medical data mining and gives physicians the most informative features without losing their linguistic interpretation.
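As an illustration of the online principal component analysis mentioned in the abstract, here is a minimal sketch of a single Oja neuron, which updates its weight vector one sample at a time with the rule w ← w + η·y·(x − y·w), where y = w·x. The learning rate, epoch count, seed, and synthetic data are assumptions chosen for the example; the physician-interaction part of the system is not modeled here.

```python
import math
import random

def oja_first_component(samples, lr=0.01, epochs=50, seed=0):
    """Estimate the first principal direction with Oja's learning rule.

    `samples` must already be centered (zero mean per feature).
    """
    rng = random.Random(seed)
    d = len(samples[0])
    w = [rng.uniform(-1, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in w))
    w = [x / norm for x in w]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian term y*x plus a decay term -y^2*w that keeps |w| near 1.
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Synthetic 2-D data stretched along the (1, 1) direction.
rng = random.Random(1)
raw = []
for _ in range(200):
    t, e = rng.gauss(0, 1), rng.gauss(0, 0.1)
    raw.append((t + e, t - e))
mean = [sum(p[k] for p in raw) / len(raw) for k in range(2)]
centered = [(p[0] - mean[0], p[1] - mean[1]) for p in raw]
w = oja_first_component(centered)
```

Each update touches only the current sample, so the estimate can be refreshed as new patient records arrive, without storing or refactoring a covariance matrix.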


Author(s):  
OLCAY KURSUN ◽  
OLEG V. FAVOROV

Feature selection and extraction are critical steps in many areas where pattern recognition techniques are applied. Feature selection and extraction are based on identifying and maximizing dependency relations. Gebelein's Maximal Correlation (GMC) is the most general form of dependence in that it does not make any statistical assumptions concerning the nature of the dependencies. Unfortunately, benefiting from such a useful measure in practice is generally impossible as there are only a few cases for which explicit formulae are available to calculate it. In this paper, we point out a parallel between GMC and the SINBAD algorithms, developed originally as a model of feature extraction for neurons in the cerebral cortex. We use SINBAD as a robust approximation to GMC to perform feature selection and extraction on a number of artificial and real datasets. We show that SINBAD estimates of GMC compare favorably to other well known feature selection and extraction methods based on mutual information, kernel canonical correlation analysis and principal component analysis.
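For reference, the standard definition of Gebelein's maximal correlation between two random variables X and Y is the largest Pearson correlation attainable by transforming each variable separately (the notation below is the usual textbook form, not taken from the paper itself):

```latex
\rho^{*}(X, Y) \;=\; \sup_{f,\, g} \operatorname{corr}\bigl(f(X),\, g(Y)\bigr),
\qquad 0 < \operatorname{Var} f(X) < \infty, \quad 0 < \operatorname{Var} g(Y) < \infty
```

The supremum runs over all measurable functions f and g, which is why no assumption about the form of the dependence is needed: the measure lies between 0 and 1 and equals 0 exactly when X and Y are independent. The same supremum is also what makes it hard to compute in closed form.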


2009 ◽  
Vol 147-149 ◽  
pp. 588-593 ◽  
Author(s):  
Marcin Derlatka ◽  
Jolanta Pauk

This paper proposes a procedure for processing biomechanical data. It consists of selecting suitable noise-free data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and then classifying it with a decision tree. The classification into groups (normal gait and two selected gait pathologies: Spina Bifida and Cerebral Palsy) gave very good results.


Author(s):  
Mohammad M. Masud ◽  
Latifur Khan ◽  
Bhavani Thuraisingham

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features such as the total number of words in the message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature-selection and dimension-reduction. This step is necessary to reduce noise and redundancy in the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of a decision tree and a greedy selection algorithm. The dimension-reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.

