Principal component analysis based on non-parametric maximum entropy

2010 ◽  
Vol 73 (10-12) ◽  
pp. 1840-1852 ◽  
Author(s):  
Ran He ◽  
Baogang Hu ◽  
XiaoTong Yuan ◽  
Wei-Shi Zheng
2018 ◽  
Vol 3 (1) ◽  
Author(s):  
Husaini Husaini ◽  
Huzaeni Huzaeni ◽  
Fahmi Fahmi

Abstract: Principal component analysis (PCA) is a statistical technique and a non-parametric method for extracting relevant information from data sets whose reliability is still in doubt and which need processing to remove disturbances. One example of such data is the electrocardiogram (ECG) signal, which is obtained by recording the electrical activity of the heart. ECG recordings are used not only for diagnosis but are also stored as references for classifying ECG arrhythmias. To obtain better results, the dimensionality of the ECG signal data is reduced in order to remove inappropriate, irrelevant and redundant data, which saves computational cost and helps prevent over-fitting. This paper describes the basic idea of PCA for reducing the dimensionality of ECG signal data. The results presented are the steps of the PCA algorithm and the signal classification accuracy obtained with the KNN and Naive Bayes methods. Keywords: principal component analysis (PCA), ECG signal, dimensionality reduction
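The pipeline described in this abstract (reduce dimensionality with PCA, then classify) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy feature vectors, the class labels, the use of power iteration to find only the first principal component, and the 1-nearest-neighbour rule are all assumptions chosen to keep the example short and free of external libraries.

```python
import math

def center(rows):
    """Subtract the column means from every row; also return the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Leading eigenvector of cov (unit length), found by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, means, axis):
    """Score of each row along the given principal axis."""
    return [sum((x - m) * a for x, m, a in zip(row, means, axis)) for row in rows]

def nearest_label(train_scores, train_labels, score):
    """1-nearest-neighbour classification in the reduced (1-D) space."""
    i = min(range(len(train_scores)), key=lambda k: abs(train_scores[k] - score))
    return train_labels[i]

# Hypothetical 3-feature "signals"; real ECG features would be far longer.
train = [[1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [3.0, 3.1, 2.9], [3.2, 2.9, 3.1]]
labels = ["normal", "normal", "arrhythmia", "arrhythmia"]

centered, means = center(train)
axis = first_component(covariance(centered))
scores = project(train, means, axis)
pred = nearest_label(scores, labels, project([[2.9, 3.0, 3.0]], means, axis)[0])
print(pred)
```

Keeping more than one component would require deflation or a full eigendecomposition, which a numerical library handles more robustly than this sketch.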


Blood ◽  
2009 ◽  
Vol 114 (22) ◽  
pp. 3802-3802
Author(s):  
Ester Mejstrikova ◽  
Vendula Pelkova ◽  
Michaela Reiterová ◽  
Martina Sukova ◽  
Zuzana Zemanova ◽  
...  

Abstract 3802, Poster Board III-738. Introduction: Monosomy 7 and del(7q) are frequent cytogenetic abnormalities in children with myelodysplastic syndrome (MDS) and are associated with poor prognosis. MDS globally affects all cellular subsets in bone marrow and in peripheral blood. We asked whether flow cytometry (FC) can separate individual subtypes of MDS from each other and from aplastic anemia (SAA), and whether, within individual subtypes of childhood MDS, it can separate patients with and without monosomy 7. Patients/analyzed parameters: We analyzed 94 children who were diagnosed and treated for MDS or SAA between 1998 and 2009 and whose immunophenotype was analyzed centrally in the reference lab: 14 patients with refractory cytopenia (RC), 37 patients with advanced forms of MDS (JMML 10, RAEB 25, CMML 2) and 43 patients with SAA. Monosomy 7/del(7q) was present in 17 patients (RC 6, JMML 3, RAEB 8). The analyzed parameters were: B cells, CD10+CD19+, CD19+45dim/neg, CD19+34+, CD19/CD34 ratio, CD34+, CD117 cells, CD34+38dim/neg, CD3+, CD3+4+, CD3+8+, CD3+HLADR+. Statistics: We analyzed all parameters using non-parametric tests (Mann-Whitney, Kruskal-Wallis) and principal component analysis (PCA). Results: PCA of all analyzed patients together clearly separates advanced forms of MDS from RC and SAA, the main contributing factors being the numbers of CD34+ and CD117+ cells. In non-parametric statistics, the following factors differ significantly among MDS subtypes and SAA (Kruskal-Wallis): CD19, CD117, CD34, CD3, CD3+4+, CD8+ and CD3+HLADR+. RC and SAA patients are separated mainly by the number of B cells and the CD34:CD19 ratio. In addition, the following parameters differ between RC and SAA (Mann-Whitney): CD34, CD117 and CD3+HLADR+. Unlike the CD34:CD19 ratio, the number of CD19+34+ precursors does not differ between RC and SAA patients. Patients with monosomy 7 do not differ from the remaining patients, whether all MDS patients are analyzed together or separately in the respective subgroups (RC, non-RC, JMML), by PCA or by non-parametric statistics. Conclusion: PCA separates advanced MDS forms from RC and SAA. Advanced forms of MDS are characterized by an increased percentage of CD34+ and CD117+ cells compared to RC and SAA patients. The global reduction of the B cell progenitor compartment is pronounced especially in non-JMML cases of MDS, whereas SAA patients typically present with an isolated reduction of cells at early stages (CD19+34+) of B cell development. Patients with monosomy 7 cluster within their respective disease category; they do not form a cluster of their own in PCA. Supported by MSMT VZ MSM0021620813, MZO 00064203 VZ FNM, MZO VFN2005, IGA NR/9531-3, NPV 2B06064. Disclosures: No relevant conflicts of interest to declare.
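For readers unfamiliar with the Mann-Whitney test named in this abstract, the following is a minimal sketch of its U statistic, computed by direct pair counting. The function name and the two groups of percentages are hypothetical and are not data from the study; no p-value is computed here.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: number of (x, y) pairs in which the x value
    is larger, with tied pairs counting one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

# Hypothetical marker percentages in two patient groups (illustration only).
group_a = [4.1, 5.3, 6.0]
group_b = [0.5, 1.2, 0.9, 1.1]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b)
```

The two statistics always sum to the number of pairs, `len(x) * len(y)`; a U near either extreme indicates that one group tends to rank above the other.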


Author(s):  
Katarzyna A. Kurek ◽  
Wim Heijman ◽  
Johan van Ophem ◽  
Stanisław Gędek ◽  
Jacek Strojny

Abstract: This article discusses two methods to measure the concept of local competitiveness: Principal Component Analysis (PCA) and Analytical Hierarchy Process (AHP). The goal of this analysis is to determine whether these two methods used in social sciences research lead to comparable model results. By non-parametric tests we show that there is a significant correlation between the PCA and AHP local competitiveness indexes. Thereafter, a developed mixed method examination of whether the methods can be used interchangeably is presented and illustrated with detailed examples of two mixed approaches. The mixed method confirms the correlation between the PCA and AHP models. However, the mixed modelling results indicate the utility of the PCA in the situation of a multicriteria local competitiveness data examination.


Author(s):  
J. Tourenq ◽  
V. Rohrlich

Correspondence analysis, a non-parametric principal component analysis, has been used to analyze heavy mineral data so that variations between both samples and minerals can be studied simultaneously. Four data sets were selected to demonstrate the method. The first example, modern sediments from the River Nile, illustrates how correspondence analysis brings out extra details in heavy mineral associations. The other examples come from the Plio-Quaternary "Bourbonnais Formation" of the French Massif Central. The first data set demonstrates how the principal factor plane (with axes 1 and 2) highlights relationships between geographical position and the predominant heavy mineral association (metamorphic minerals and zircon), suggesting the paleogeographic source. In the second set, the factor plane of axes 1 and 3 indicates a subdivision of the metamorphic mineral assemblage, suggesting two sources of metamorphic minerals. Finally, outcrop samples were projected onto the factor plane and reveal ancient drainage systems important for the accumulation of the Bourbonnais sands. Statistical methods used in interpreting heavy minerals in sediments range from simple and classical methods, such as calculation of means and standard deviations, to the calculation of correspondences and variances. Use of multivariate methods is increasingly frequent (Maurer, 1983; Stattegger, 1986; 1987; Delaune et al., 1989; Mezzadri and Saccani, 1989) since the first studies of Imbrie and vanAndel (1964). Ordination techniques such as principal component analysis (Harman, 1961) synthesize large amounts of data and extract the most important relationships. We have chosen a non-parametric form of principal component analysis called correspondence analysis. This technique has been used in sedimentology by Chenet and Teil (1979) to investigate deep-sea samples, by Cojan and Teil (1982) and Mercier et al. (1987) to define paleoenvironments, and by Cojan and Beaudoin (1986) to show paleoecological control of deposition in French sedimentary basins. Correspondence analysis has been used successfully to interpret heavy mineral data (Tourenq et al., 1978a, 1978b; Bolin et al., 1982; Tourenq, 1986, 1989; Faulp et al., 1988; Ambroise et al., 1987). We provide examples of different situations where the method can be applied. We will not present the mathematical and statistical procedures involved in correspondence analysis, but refer readers to Benzécri et al.


2015 ◽  
Vol 712 ◽  
pp. 101-106
Author(s):  
Ewa Skrzypczak-Pietraszek ◽  
Jacek Pietraszek

High dimensionality and unknown distributions are often encountered in plant biotechnology and phytochemistry investigations. This paper presents two methods: principal component analysis, which reduces dimensionality, and the non-parametric Kruskal-Wallis ANOVA, which separates the influence of factors even if the distribution is unknown. The paper contains the problem definition, a presentation of the measured data and the final analysis. The approach may also be useful in other industrial or research applications.
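The Kruskal-Wallis test mentioned in this abstract works on ranks rather than raw values, which is why it does not need a known distribution. A minimal sketch of the H statistic follows; the function names and the three groups of measurements are hypothetical, and the tie-correction factor and the p-value are deliberately omitted.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical measurements under three treatments (illustration only).
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h = kruskal_wallis_h(groups)
print(h)
```

For reasonably large groups, H is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.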


VASA ◽  
2012 ◽  
Vol 41 (5) ◽  
pp. 333-342 ◽  
Author(s):  
Kirchberger ◽  
Finger ◽  
Müller-Bühl

Background: The Intermittent Claudication Questionnaire (ICQ) is a short questionnaire for the assessment of health-related quality of life (HRQOL) in patients with intermittent claudication (IC). The objective of this study was to translate the ICQ into German and to investigate the psychometric properties of the German ICQ version in patients with IC. Patients and methods: The original English version was translated using a forward-backward method. The resulting German version was reviewed by the author of the original version and an experienced clinician. Finally, it was tested for clarity with 5 German patients with IC. The German ICQ was administered to a sample of 81 patients. The sample consisted of 58.0 % male patients with a median age of 71 years and a median IC duration of 36 months. Tests of feasibility included completeness of questionnaires, completion time, and ratings of clarity, length and relevance. Reliability was assessed through a retest in 13 patients at 14 days, and analysis of Cronbach’s alpha for internal consistency. Construct validity was investigated using principal component analysis. Concurrent validity was assessed by correlating the ICQ scores with the Short Form 36 Health Survey (SF-36) as well as clinical measures. Results: The ICQ was completely filled in by 73 subjects (90.1 %) with an average completion time of 6.3 minutes. Cronbach’s alpha coefficient reached 0.75. Intra-class correlation for test-retest reliability was r = 0.88. Principal component analysis resulted in a 3-factor solution. The first factor explained 51.5 % of the total variation and all items had loadings of at least 0.65 on it. The ICQ was significantly associated with the SF-36 and treadmill-walking distances whereas no association was found for resting ABPI. Conclusions: The German version of the ICQ demonstrated good feasibility, satisfactory reliability and good validity. Responsiveness should be investigated in further validation studies.
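Cronbach's alpha, used in this abstract as the measure of internal consistency, compares the summed variance of the individual items with the variance of the total score. A minimal sketch follows; the function name and the three-item, five-respondent score table are hypothetical and unrelated to the ICQ data.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, with the same
    respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]       # total score per respondent
    item_var = sum(variance(scores) for scores in items)   # sum of item variances
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical answers of five respondents to three items (illustration only).
items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, and drops toward 0 when they vary independently.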

