COMPARISON OF CLASSIFICATION TECHNIQUES ON HEART DISEASE DATA SET

2017 ◽  
Vol 8 (9) ◽  
pp. 276-280
Author(s):  
K. K. Revathi ◽  
Author(s):  
Guizhou Hu ◽  
Martin M. Root

Background: No methodology is currently available to allow the combining of individual risk factor information derived from different longitudinal studies for a chronic disease in a multivariate fashion. This paper introduces such a methodology, named Synthesis Analysis, which is essentially a multivariate meta-analytic technique. Design: The construction and validation of statistical models using available data sets. Methods and results: Two analyses are presented. (1) With the same data, Synthesis Analysis produced a similar prediction model to the conventional regression approach when using the same risk variables. Synthesis Analysis produced better prediction models when additional risk variables were added. (2) A four-variable empirical logistic model for death from coronary heart disease was developed with data from the Framingham Heart Study. A synthesized prediction model with five new variables added to this empirical model was developed using Synthesis Analysis and literature information. This model was then compared with the four-variable empirical model using the first National Health and Nutrition Examination Survey (NHANES I) Epidemiologic Follow-up Study data set. The synthesized model had significantly improved predictive power (χ² = 43.8, P < 0.00001). Conclusions: Synthesis Analysis provides a new means of developing complex disease predictive models from the medical literature.


Circulation ◽  
2016 ◽  
Vol 133 (suppl_1) ◽  
Author(s):  
Nina P Paynter ◽  
Raji Balasubramanian ◽  
Shuba Gopal ◽  
Franco Giulianini ◽  
Leslie Tinker ◽  
...  

Background: Prior studies of metabolomic profiles and coronary heart disease (CHD) have been limited by relatively small case numbers and scant data in women. Methods: The discovery set examined 371 metabolites in 400 confirmed, incident CHD cases and 400 controls (frequency matched on age, race/ethnicity, hysterectomy status and time of enrollment) in the Women’s Health Initiative Observational Study (WHI-OS). All selected metabolites were validated in a separate set of 394 cases and 397 matched controls drawn from the placebo arms of the WHI Hormone Therapy trials and the WHI-OS. Discovery used 4 methods: false-discovery rate (FDR) adjusted logistic regression for individual metabolites, permutation corrected least absolute shrinkage and selection operator (LASSO) algorithms, sparse partial least squares discriminant analysis (PLS-DA) algorithms, and random forest algorithms. Each method was performed with matching factors only and with matching plus both medication use (aspirin, statins, anti-diabetics and anti-hypertensives) and traditional CHD risk factors (smoking, systolic blood pressure, diabetes, total and HDL cholesterol). Replication in the validation set was defined as a logistic regression coefficient of p<0.05 for the metabolites selected by 3 or 4 methods (tier 1), or an FDR adjusted p<0.05 for metabolites selected by only 1 or 2 methods (tier 2). Results: Sixty-seven metabolites were selected in the discovery data set (30 tier 1 and 37 tier 2). Twenty-six successfully replicated in the validation data set (21 tier 1 and 5 tier 2), with 25 significant when adjusting for matching factors only and 11 significant after additionally adjusting for medications and CHD risk factors. Validated metabolites included amino acids, sugars, nucleosides, eicosanoids, plasmalogens, polyunsaturated phospholipids and highly saturated triglycerides.
These include novel metabolites as well as metabolites such as glutamate/glutamine, which have been reported in other populations. Conclusions: Multiple metabolites in important physiological pathways with robust associations for risk of CHD in women were identified and replicated. These results may offer insights into biological mechanisms of CHD as well as identify potential markers of risk.
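
The abstract above does not say which FDR procedure was used. As an illustration only, the following minimal sketch assumes the common Benjamini-Hochberg step-up procedure and pairs it with the tier rule stated in the abstract (3 or 4 selecting methods is tier 1, 1 or 2 is tier 2). The function names `benjamini_hochberg` and `assign_tier` are hypothetical, and the p-values in the usage note are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg is used; the abstract only says "FDR adjusted".
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


def assign_tier(n_methods_selecting):
    """Tier 1 if 3 or 4 discovery methods selected the metabolite, tier 2 if 1 or 2."""
    if n_methods_selecting >= 3:
        return 1
    if n_methods_selecting >= 1:
        return 2
    return None  # not selected by any method
```

For example, `benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205])` rejects only the first two hypotheses at the default level, because no later p-value falls under its rank-scaled threshold.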


Fuzzy Systems ◽  
2017 ◽  
pp. 682-714 ◽  
Author(s):  
Swati Aggarwal ◽  
Venu Azad

In the medical field, diagnosing a disease at an early stage is very important. Soft computing techniques such as fuzzy logic, artificial neural networks and neuro-fuzzy networks are now widely used to diagnose various diseases at different levels. In this chapter, a hybrid neural network is designed to classify the heart disease data set. The hybrid network consists of two types of neural network, a multilayer perceptron (MLP) and a fuzzy min-max (FMM) neural network, arranged in a hierarchical manner. The hybrid system is designed for data sets that contain a combination of continuous and non-continuous attribute values. Attributes with continuous values are classified using the FMM neural network, and attributes with non-continuous values are classified using the MLP neural network. To synthesize the result, the outputs of both networks are fed into a second MLP neural network, which generates the final result.
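
The chapter abstract does not give the FMM equations. As a rough illustration of how a fuzzy min-max network scores a continuous-valued input, the sketch below assumes the classic hyperbox membership function from Simpson's original FMM classifier; whether the chapter uses this exact form is an assumption. The name `hyperbox_membership` and the sensitivity parameter `gamma` are illustrative, and inputs are assumed to be scaled to [0, 1].

```python
def hyperbox_membership(point, v_min, w_max, gamma=4.0):
    """Degree (0..1) to which `point` belongs to the hyperbox [v_min, w_max].

    Assumed form: Simpson-style membership, averaged over all dimensions.
    gamma controls how fast membership decays outside the box.
    """
    n = len(point)
    total = 0.0
    for a, v, w in zip(point, v_min, w_max):
        # Penalty for exceeding the max corner in this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, a - w)))
        # Penalty for falling below the min corner in this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, v - a)))
        total += above + below
    return total / (2 * n)
```

A point inside the box gets full membership, and membership falls off linearly with distance outside it. In a hybrid design like the one described, such per-class membership values would be one of the inputs passed on to the second-stage MLP.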


Kybernetes ◽  
2019 ◽  
Vol 48 (9) ◽  
pp. 2006-2029
Author(s):  
Hongshan Xiao ◽  
Yu Wang

Purpose: Feature space heterogeneity exists widely in various application fields of classification techniques, such as customs inspection decision, credit scoring and medical diagnosis. This paper aims to study the relationship between feature space heterogeneity and classification performance. Design/methodology/approach: A measurement is first developed for measuring and identifying any significant heterogeneity that exists in the feature space of a data set. The main idea of this measurement is derived from meta-analysis. For a data set with significant feature space heterogeneity, a classification algorithm based on factor analysis and clustering is proposed to learn the data patterns, which, in turn, are used for data classification. Findings: The proposed approach has two main advantages over previous methods. The first advantage lies in the feature transform using orthogonal factor analysis, which results in new features without redundancy and irrelevance. The second advantage rests on sample partitioning to capture the feature space heterogeneity reflected by differences in factor scores. The validity and effectiveness of the proposed approach are verified on a number of benchmark data sets. Research limitations/implications: The measurement should be used to guide the heterogeneity elimination process, which is an interesting topic for future research. In addition, developing a classification algorithm that enables scalable and incremental learning for large data sets with significant feature space heterogeneity is also an important issue. Practical implications: Measuring and eliminating the feature space heterogeneity that may exist in the data are important for accurate classification. This study provides a systematic approach to feature space heterogeneity measurement and elimination for better classification performance, which is favorable for applications of classification techniques in real-world problems.
Originality/value: A measurement based on meta-analysis for measuring and identifying any significant feature space heterogeneity in a classification problem is developed, and an ensemble classification framework is proposed to deal with the feature space heterogeneity and improve the classification accuracy.
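
The abstract says only that the heterogeneity measurement is derived from meta-analysis and does not give its formula. To show the kind of statistic this could mean, the sketch below computes Cochran's Q and the I² index, two standard meta-analytic heterogeneity measures. Using them here is an assumption, not the authors' stated method; `effects` and `variances` are hypothetical per-subset estimates and their sampling variances.

```python
def cochran_q(effects, variances):
    """Return (Q, I_squared) for a set of estimates and their variances.

    Q is the inverse-variance-weighted sum of squared deviations from the
    pooled estimate; I_squared is the share of variation beyond chance.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i_squared
```

With two estimates of 0.0 and 2.0 and unit variances, `cochran_q([0.0, 2.0], [1.0, 1.0])` returns a Q of 2.0 and an I² of 0.5. Identical estimates give an I² of zero, which is the homogeneous case.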


Author(s):  
S Joshika

Yelp connects people to great local businesses in the USA and maintains a site for searching for and finding any business there. This helps users compare businesses based on the star ratings and reviews given by other users, and identify the best company among those available according to their needs. The data set provided in the Yelp challenge contains tip, review, user, check-in and business details, and is called the TURBO set for short. Participants have used it in various ways to find interesting patterns. This paper surveys work on pre-processing, sentiment analysis, sentiment classification techniques, and proposed classification algorithms that perform better than other existing algorithms. Most of the surveyed papers applied their algorithms to the Yelp data set; the others applied them to different data.


2012 ◽  
Vol 263-266 ◽  
pp. 3342-3347
Author(s):  
Nan Nan Xie ◽  
Fei Yan Chen ◽  
Kuo Zhao ◽  
Liang Hu

The BP neural network is a widely used neural network, with advantages such as adaptability, fault tolerance and self-organization. However, its network structure is difficult to determine, and it easily falls into local minima. In this paper, an optimized BP neural network based on DS is proposed; the advantages of DS evidential reasoning on uncertain information are used to improve the recognition rate and credibility of BP. Experiments on the Heart Disease data set show that the proposed method performs well in run time, prediction accuracy and robustness.
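
The abstract does not describe how DS evidential reasoning is wired into the BP network. As background, the sketch below shows Dempster's rule of combination, the core operation of Dempster-Shafer theory, applied to two mass functions. It assumes each evidence source (for example, the outputs of two networks) has already been turned into a mass function over subsets of the class labels. The name `dempster_combine` and the label names are illustrative.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of labels to its belief mass;
    the masses within one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                # Agreeing evidence accumulates on the intersection.
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                # Disjoint focal sets contribute to the conflict mass.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


D = frozenset({"disease"})
H = frozenset({"healthy"})
U = D | H  # full frame: "don't know"
fused = dempster_combine({D: 0.6, U: 0.4}, {D: 0.5, H: 0.3, U: 0.2})
```

In this made-up example the fused mass on `D` is 0.62 / 0.82, roughly 0.756, which is higher than either source assigned alone. The conflicting mass of 0.18 is discarded and the rest is renormalized.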

