Clustering of Cardiovascular Disease Patients Using Data Mining Techniques with Principal Component Analysis and K-Medoids

Author(s):  
Edy Irwansyah ◽  
Ebiet Salim Pratama ◽  
Margaretha Ohyver

Cardiovascular disease is the leading cause of death in the world: according to the WHO, around 31% of deaths worldwide are caused by cardiovascular diseases, and more than 75% of these deaths occur in developing countries. Patients with cardiovascular disease generate many medical records that can be used for further patient management. This study aims to develop a data mining method that groups patients with cardiovascular disease in order to determine the level of patient complications in each of two clusters. The method applied combines principal component analysis (PCA), which reduces the dimensionality of the large dataset available, with cluster analysis using the K-Medoids algorithm. Data reduction with PCA produced five new components with a cumulative proportion of variance of 0.8311. These five components were then used for cluster formation with the K-Medoids algorithm, which resulted in two clusters with a silhouette coefficient of 0.35. The combination of data reduction by PCA and the K-Medoids clustering algorithm is a new way of grouping data on patients with cardiovascular disease according to the level of patient complications in each resulting cluster.
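
A minimal sketch in Python of the pipeline the abstract describes (standardization, PCA to five components, K-Medoids into two clusters, silhouette coefficient). The patient feature matrix `X` is a placeholder and the `KMedoids` implementation comes from the scikit-learn-extra package; only the choice of five components and two clusters is taken from the abstract.

```python
# Hypothetical sketch: PCA dimensionality reduction followed by K-Medoids
# clustering and a silhouette score, mirroring the steps in the abstract.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn_extra.cluster import KMedoids  # from the scikit-learn-extra package

X = np.random.rand(300, 12)  # placeholder for the patients' medical-record features

# Standardize, then reduce the data to five principal components.
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=5)
X_pca = pca.fit_transform(X_std)
print("cumulative proportion of variance:", pca.explained_variance_ratio_.sum())

# Form two clusters with K-Medoids and report the silhouette coefficient.
labels = KMedoids(n_clusters=2, random_state=0).fit_predict(X_pca)
print("silhouette coefficient:", silhouette_score(X_pca, labels))
```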

2017 ◽  
Vol 25 (6) ◽  
pp. 949-966 ◽  
Author(s):  
G Asencio-Cortés ◽  
F Martínez-Álvarez ◽  
A Morales-Esteban ◽  
J Reyes ◽  
A Troncoso

Abstract Increasing attention has been paid to the prediction of earthquakes with data mining techniques during the last decade. Several works have already proposed the use of certain features serving as inputs for supervised classifiers. However, these features have so far been used without any further transformation. In this work, the use of principal component analysis (PCA) to reduce data dimensionality and generate new datasets is proposed. In particular, this step is inserted into a methodology already used successfully to predict earthquakes. Tokyo, one of the Japanese cities most threatened by the occurrence of large earthquakes, is studied. Several well-known classifiers combined with PCA have been used. A noticeable improvement in the results is reported.
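
As an illustration only, the sketch below inserts a PCA step ahead of several common classifiers, in the spirit of the methodology described; the seismicity feature matrix `X`, the labels `y`, and the specific classifiers are placeholders rather than the authors' actual inputs.

```python
# Hypothetical sketch: PCA-transformed features fed to several classifiers.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(500, 20)             # placeholder seismicity features
y = np.random.randint(0, 2, size=500)   # placeholder binary labels (quake / no quake)

for clf in (KNeighborsClassifier(), SVC(), RandomForestClassifier()):
    # Keep enough components to explain 95% of the variance before classifying.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=0.95), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(type(clf).__name__, scores.mean())
```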


2009 ◽  
Vol 147-149 ◽  
pp. 588-593 ◽  
Author(s):  
Marcin Derlatka ◽  
Jolanta Pauk

In this paper, a procedure for processing biomechanical data is proposed. It consists of selecting suitable noiseless data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and finally classification using a decision tree. The results of classification into groups (normal gait and two selected gait pathologies: Spina Bifida and Cerebral Palsy) were very good.
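
A rough Python sketch of the classification stage described above: Kernel PCA feature extraction followed by a decision tree. The gait feature matrix `X`, the three-class labels `y`, the RBF kernel, and the number of components are assumptions for illustration.

```python
# Hypothetical sketch: Kernel PCA features classified with a decision tree.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X = np.random.rand(90, 30)             # placeholder biomechanical gait features
y = np.random.randint(0, 3, size=90)   # placeholder labels: normal / Spina Bifida / Cerebral Palsy

pipe = make_pipeline(KernelPCA(n_components=5, kernel="rbf"),
                     DecisionTreeClassifier(random_state=0))
print("mean CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```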


1990 ◽  
Vol 1 (3) ◽  
pp. 131-144
Author(s):  
María Coscarón

Cluster analysis by four methods and a principal component analysis were performed using data on 24 morphological characters of 27 species of the genus Rasahus (Peiratinae). The results obtained by the different techniques show general agreement. They confirm the present number of taxa and reveal the existence within the genus of three groups of species: scutellaris, hamatus and vittatus. The scutellaris group is constituted by R. aeneus (Walker), R. maculipennis (Lepelletier and Serville), R. bifurcatas Champion, R. castaneus Coscarón, R. guttatipennis (Stål), R. flavovittarus Stål, R. costarricensis Coscarón, R. scutellaris (Fabricius), R. atratus Coscarón, R. peruensis Coscarón, R. paraguayensis Coscarón, R. surinamensis Coscarón, R. albomaculatus Mayr, R. brasiliensis Coscarón and R. sulcicollis (Serville). The hamatus group contains R. rufiventris (Walker), R. hamatus (Fabricius), R. amapaensis Coscarón, R. arcitenens Stål, R. limai Pinto, R. angulatus Coscarón, R. thoracicus Stål, R. biguttatus (Say), R. arcuiger (Stål), R. argentinensis Coscarón and R. grandis Fallou. The vittatus group contains R. vittatus Coscarón. The characters used to separate the groups of species are: shape of the pygophore, shape of the parameres, basal plate complexity, shape of the postocular region and hemelytra pattern. Illustrations of the structures of major diagnostic importance are included.
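
For readers who want to reproduce this kind of analysis, the sketch below runs a hierarchical cluster analysis with several linkage methods and a PCA ordination on a species-by-character matrix. The matrix `M` is a placeholder, and the linkage criteria shown merely stand in for the four (unnamed here) clustering methods used in the study.

```python
# Hypothetical sketch: cluster analysis and PCA of a species-by-character matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

M = np.random.rand(27, 24)  # placeholder: 27 species x 24 morphological characters

# Agglomerative clustering with several common linkage criteria.
for method in ("single", "complete", "average", "ward"):
    Z = linkage(M, method=method)
    groups = fcluster(Z, t=3, criterion="maxclust")  # cut into three species groups
    print(method, groups)

# Ordination of the same matrix with PCA (first two axes).
scores = PCA(n_components=2).fit_transform(M)
```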


2021 ◽  
Author(s):  
Anwar Yahya Ebrahim ◽  
Hoshang Kolivand

Writer authentication based on handwritten signatures is widely practiced throughout the world, and a thorough check of the signature is important before reaching a conclusion about the signer. Arabic signatures have unique characteristics: they include lines and overlapping strokes, which makes it more difficult to achieve high recognition accuracy. This project addresses that difficulty by selecting the best characteristics for Arabic signature authentication, characterized by the number of attributes representing each signature, where the objective is to determine whether a given signature is genuine or a forgery. The proposed method is based on the Discrete Cosine Transform (DCT) for feature extraction, followed by Sparse Principal Component Analysis (SPCA) to select significant attributes for Arabic handwritten signature recognition and to aid the authentication step. Finally, a decision tree classifier is used for signature authentication. The proposed DCT-with-SPCA method achieves good results on an Arabic signature dataset when verified against various techniques.
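
A hedged Python sketch of the proposed chain of steps (DCT feature extraction, Sparse PCA attribute selection, decision tree classification). The signature images, labels, DCT block size, and number of sparse components are all illustrative assumptions, not the authors' settings.

```python
# Hypothetical sketch: DCT features -> Sparse PCA -> decision tree classifier.
import numpy as np
from scipy.fft import dctn
from sklearn.decomposition import SparsePCA
from sklearn.tree import DecisionTreeClassifier

imgs = np.random.rand(60, 64, 64)       # placeholder signature images
y = np.random.randint(0, 2, size=60)    # placeholder labels: 1 = genuine, 0 = forgery

# 2-D DCT of each image; keep the low-frequency 8x8 block as features.
feats = np.array([dctn(im, norm="ortho")[:8, :8].ravel() for im in imgs])

# Sparse PCA to select a small set of significant attributes.
feats_sp = SparsePCA(n_components=10, random_state=0).fit_transform(feats)

# Decision tree for the genuine/forgery decision.
clf = DecisionTreeClassifier(random_state=0).fit(feats_sp, y)
print("training accuracy:", clf.score(feats_sp, y))
```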


Author(s):  
Kalyani Kadam ◽  
Pooja Vinayak Kamat ◽  
Amita P. Malav

Cardiovascular diseases (CVDs) have become one of the most life-threatening diseases in recent times. The key to managing them effectively is to analyze huge datasets and mine them to predict, and ultimately prevent, heart-related diseases. The primary objective of this chapter is to understand and survey various information mining strategies for efficiently determining the occurrence of CVDs, and to propose a big data architecture for the same. The authors make use of Apache Spark for the implementation.
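
Since the chapter mentions Apache Spark, the sketch below shows one way such a pipeline could look in PySpark; the file name `heart.csv`, the `target` label column, and the logistic regression model are hypothetical stand-ins, not the architecture proposed by the authors.

```python
# Hypothetical PySpark sketch: load a heart-disease dataset and train a classifier.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("cvd-prediction").getOrCreate()

# Hypothetical CSV with numeric feature columns and a binary "target" column.
df = spark.read.csv("heart.csv", header=True, inferSchema=True)
feature_cols = [c for c in df.columns if c != "target"]

assembled = VectorAssembler(inputCols=feature_cols, outputCol="features").transform(df)
train, test = assembled.randomSplit([0.8, 0.2], seed=42)

model = LogisticRegression(labelCol="target").fit(train)
print("test accuracy:", model.evaluate(test).accuracy)
```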


2019 ◽  
Vol 8 (5) ◽  
pp. 136
Author(s):  
John Rennie Short ◽  
Justin Vélez-Hagan ◽  
Leah Dubots

There are now a wide variety of global indicators that measure different economic, political and social attributes of countries in the world. This paper seeks to answer two questions. First, what is the degree of overlap between these different measures? Are they, in fact, measuring the same underlying dimension? To answer this question, we apply a principal component analysis (PCA) to 15 indices across 145 countries. The results demonstrate that there is one underlying dimension that combines economic development and social progress with state stability. Second, how do countries score on this dimension? The results of the PCA allow us to produce categorical divisions of the world. The threefold division identifies a world composed of what we describe and map as rich, poor and middle countries. A five-group classification provided a more nuanced categorization described as: The very rich, free and stable; affluent and free; upper middle; lower middle; poor and not free.
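
A minimal sketch of this kind of analysis: PCA on a standardized country-by-index table, inspection of the variance explained by the first component, and a categorical split of countries by their first-component scores. The data frame and the quantile-based three-way split are placeholders, not the paper's exact procedure.

```python
# Hypothetical sketch: PCA of global indices and a threefold country classification.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

indices = pd.DataFrame(np.random.rand(145, 15))   # placeholder: 145 countries x 15 indices

pca = PCA()
scores = pca.fit_transform(StandardScaler().fit_transform(indices))
print("variance explained by PC1:", pca.explained_variance_ratio_[0])

# Split countries into three groups by their PC1 score (the sign/direction of
# PC1 would need to be checked before labeling groups in practice).
groups = pd.qcut(scores[:, 0], q=3, labels=["poor", "middle", "rich"])
```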


2020 ◽  
Vol 1 ◽  
pp. 2385-2394
Author(s):  
M. Schöberl ◽  
E. Rebentisch ◽  
J. Trauer ◽  
M. Mörtl ◽  
J. Fottner

Abstract As model-based systems engineering (MBSE) is evolving, the need for evaluating MBSE approaches grows. The literature shows that there is an untested assertion in the MBSE community that complexity drives the adoption of MBSE. To assess this assertion and support the evaluation of MBSE, a principal component analysis was carried out on eight product and development characteristics using data collected in an MBSE course, resulting in three factors: organizational complexity, product complexity and inertia. To conclude, the method developed in this paper enables organisations to evaluate their MBSE adoption potential.
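
A brief sketch, under assumptions, of the analysis described: PCA of standardized responses on eight characteristics, with three retained components whose loadings would then be interpreted (e.g., as organizational complexity, product complexity and inertia). The response matrix `R` is a placeholder.

```python
# Hypothetical sketch: PCA of survey responses on eight characteristics.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

R = np.random.rand(40, 8)   # placeholder: respondents x eight characteristics

pca = PCA(n_components=3)
pca.fit(StandardScaler().fit_transform(R))
print(pca.components_)               # loadings used to interpret the three components
print(pca.explained_variance_ratio_)
```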


2016 ◽  
Vol 19 (03) ◽  
pp. 382-390 ◽  
Author(s):  
Martina Siena ◽  
Alberto Guadagnini ◽  
Ernesto Della Rossa ◽  
Andrea Lamberti ◽  
Franco Masserano ◽  
...  

Summary We present and test a new screening methodology to discriminate among alternative and competing enhanced-oil-recovery (EOR) techniques to be considered for a given reservoir. Our work is motivated by the observation that, even though a considerable variety of EOR techniques has been successfully applied to extend oilfield production and lifetime, an EOR project requires extensive laboratory and pilot tests before fieldwide implementation, and a preliminary assessment of EOR potential in a reservoir is therefore critical in the decision-making process. Because similar EOR techniques may be successful in fields sharing some global features, we consider fluid (density and viscosity) and reservoir-formation (porosity, permeability, depth, and temperature) properties as basic discrimination criteria. Our approach is observation-driven and grounded on an exhaustive database that we compiled after considering worldwide EOR field experiences. A preliminary reduction of the dimensionality of the parameter space over which EOR projects are classified is accomplished through principal-component analysis (PCA). A screening of target analogs is then obtained by classification of documented EOR projects through a Bayesian-clustering algorithm. Considering the cluster that includes the EOR field under evaluation, an intercluster refinement is then accomplished by ordering cluster components on the basis of a weighted Euclidean distance from the target field in the (multidimensional) parameter space. Distinctive features of our methodology are that (a) all screening analyses are performed on the database projected onto the space of principal components (PCs) and (b) the fraction of variance associated with each PC is taken as the weight in the Euclidean distance that we determine. As a test bed, we apply our approach to three fields operated by Eni. These include light-, medium-, and heavy-oil reservoirs, where gas, chemical, and thermal EOR projects were, respectively, proposed. Our results are (a) conducive to the compilation of a broad and extensively usable database of EOR settings and (b) consistent with the field observations related to the three tested and already planned/implemented EOR methodologies, thus demonstrating the effectiveness of our approach.
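
The sketch below illustrates the core screening idea under stated assumptions: project the EOR database onto its principal components, cluster the projects (a Gaussian mixture is used here purely as a stand-in for the Bayesian-clustering algorithm), and rank the members of the target field's cluster by a Euclidean distance weighted by each PC's explained-variance fraction. The database `D` and the target field row are placeholders.

```python
# Hypothetical sketch: PCA projection, clustering, and variance-weighted ranking of analogs.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

D = np.random.rand(200, 6)   # placeholder: density, viscosity, porosity, permeability, depth, temperature
pca = PCA(n_components=3)
Z = pca.fit_transform(StandardScaler().fit_transform(D))
w = pca.explained_variance_ratio_            # per-PC weights for the distance

# Cluster the projects; row 0 plays the role of the field under evaluation.
labels = GaussianMixture(n_components=4, random_state=0).fit_predict(Z)
members = np.where(labels == labels[0])[0]   # projects in the target field's cluster

# Order cluster members by weighted Euclidean distance from the target field.
dist = np.sqrt(((Z[members] - Z[0]) ** 2 * w).sum(axis=1))
analogs = members[np.argsort(dist)]
print(analogs[:10])
```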


2018 ◽  
Vol 120 (6) ◽  
pp. 3155-3171 ◽  
Author(s):  
Roland Diggelmann ◽  
Michele Fiscella ◽  
Andreas Hierlemann ◽  
Felix Franke

High-density microelectrode arrays can be used to record extracellular action potentials from hundreds to thousands of neurons simultaneously. Efficient spike sorters must be developed to cope with such large data volumes. Most existing spike sorting methods for single electrodes or small multielectrodes, however, suffer from the “curse of dimensionality” and cannot be directly applied to recordings with hundreds of electrodes. This holds particularly true for the standard reference spike sorting algorithm, principal component analysis-based feature extraction, followed by k-means or expectation maximization clustering, against which most spike sorters are evaluated. We present a spike sorting algorithm that circumvents the dimensionality problem by sorting local groups of electrodes independently with classical spike sorting approaches. It is scalable to any number of recording electrodes and well suited for parallel computing. The combination of data prewhitening before the principal component analysis-based extraction and a parameter-free clustering algorithm obviated the need for parameter adjustments. We evaluated its performance using surrogate data in which we systematically varied spike amplitudes and spike rates and that were generated by inserting template spikes into the voltage traces of real recordings. In a direct comparison, our algorithm could compete with existing state-of-the-art spike sorters in terms of sensitivity and precision, while parameter adjustment or manual cluster curation was not required. NEW & NOTEWORTHY We present an automatic spike sorting algorithm that combines three strategies to scale classical spike sorting techniques for high-density microelectrode arrays: 1) splitting the recording electrodes into small groups and sorting them independently; 2) clustering a subset of spikes and classifying the rest to limit computation time; and 3) prewhitening the spike waveforms to enable the use of parameter-free clustering. Finally, we combined these strategies into an automatic spike sorter that is competitive with state-of-the-art spike sorters.
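
A rough sketch of two of the listed strategies: prewhitening spike waveforms with an estimated noise covariance, then PCA-based feature extraction before clustering. The waveforms, noise snippets, and the BIC-selected Gaussian mixture (standing in for the parameter-free clustering) are assumptions for illustration, not the published algorithm.

```python
# Hypothetical sketch: prewhitening + PCA features + clustering with a data-driven cluster count.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

W = np.random.randn(1000, 40)        # placeholder spike waveforms (spikes x time samples)
noise = np.random.randn(5000, 40)    # placeholder noise snippets from the same channels

# Prewhiten: multiply by the inverse Cholesky factor of the noise covariance.
C = np.cov(noise, rowvar=False) + 1e-6 * np.eye(40)
L = np.linalg.cholesky(C)
W_white = W @ np.linalg.inv(L).T

# PCA-based feature extraction on the whitened waveforms.
feats = PCA(n_components=3).fit_transform(W_white)

# Choose the number of clusters by BIC instead of fixing it by hand.
models = [GaussianMixture(k, random_state=0).fit(feats) for k in range(1, 8)]
best = min(models, key=lambda m: m.bic(feats))
labels = best.predict(feats)
```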


Author(s):  
Yanwen Wang ◽  
Javad Garjami ◽  
Milena Tsvetkova ◽  
Nguyen Huu Hau ◽  
Kim-Hung Pho

Abstract Data mining, statistics, and data analysis are popular techniques for studying datasets and extracting knowledge from them. In this article, principal component analysis and factor analysis were applied to cluster thirteen different proposed arrangements of the Suras of the Holy Quran. The results showed that these thirteen arrangements can be categorized into two groups, such that the first group includes Blachère, Davood, Grimm, Nöldeke, Bazargan, E’temad-al-Saltane and Muir, and the second group includes Ebn Nadim, Jaber, Ebn Abbas, Hazrat Ali, Khazan, and Al-Azhar.
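
A minimal sketch of this type of analysis, assuming a matrix whose rows are the thirteen arrangements and whose columns are the Suras (entries being the position each arrangement assigns to each Sura): PCA and factor analysis scores, followed by a two-group split. The matrix `A` and the Ward-linkage split are placeholders.

```python
# Hypothetical sketch: PCA and factor analysis of Sura arrangements, then a two-group split.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis
from scipy.cluster.hierarchy import linkage, fcluster

A = np.random.rand(13, 114)   # placeholder: 13 arrangements x 114 Suras

# Low-dimensional scores from PCA and from factor analysis.
pca_scores = PCA(n_components=2).fit_transform(A)
fa_scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(A)

# Split the thirteen arrangements into two groups based on each set of scores.
for name, scores in (("PCA", pca_scores), ("FA", fa_scores)):
    groups = fcluster(linkage(scores, method="ward"), t=2, criterion="maxclust")
    print(name, groups)
```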

