Finding Structures of Interest in a Large Data Set Using Factor Analysis

2016 ◽  
Vol 26 (2) ◽  
Author(s):  
Peter Filzmoser

In this paper we introduce a statistical method that can be used in combination with principal component analysis or factor analysis. The variables of interest in a large data set are selected first, and loadings and scores are calculated for these variables only. We then describe how the remaining variables of the data set can be presented in the previously extracted factor space. Finally, we show a way of representing the results that helps with their interpretation.
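The two-step idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes PCA on the correlation matrix of the selected variables, and it assumes that a remaining variable is placed in the factor space through its correlation with each score column. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_scores_loadings(X_sel, n_factors):
    """PCA on the standardized selected variables; returns (scores, loadings)."""
    Z = (X_sel - X_sel.mean(axis=0)) / X_sel.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_factors]   # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)           # variable-component correlations
    scores = Z @ eigvecs / np.sqrt(eigvals)         # unit-variance component scores
    return scores, loadings

def project_remaining(X_rest, scores):
    """Assumed projection: correlation of each remaining variable with each score."""
    Zr = (X_rest - X_rest.mean(axis=0)) / X_rest.std(axis=0, ddof=1)
    return Zr.T @ scores / (scores.shape[0] - 1)

# Synthetic data: two latent factors drive four selected and two remaining variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
W = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
X_sel = latent @ W + 0.3 * rng.normal(size=(200, 4))
X_rest = latent + 0.5 * rng.normal(size=(200, 2))

scores, loadings = pca_scores_loadings(X_sel, n_factors=2)
rest_loadings = project_remaining(X_rest, scores)
print(np.round(loadings, 2))
print(np.round(rest_loadings, 2))
```

Under these assumptions, applying `project_remaining` to the selected variables themselves reproduces their loadings, so selected and remaining variables can be drawn in the same loading plot.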

2020 ◽  
Vol 13 (2) ◽  
pp. 11
Author(s):  
Bekti Endar Susilowati ◽  
Pardomuan Robinson Sihombing

Principal Component Analysis (PCA) is a multivariate technique used to replace the original variables with a small number of principal components without losing much information. In other words, it is used to explain the underlying variance-covariance structure of a large set of variables through a few linear combinations of these variables. PCA is strongly affected by the presence of outliers because it is based on the covariance matrix, which is sensitive to outliers. This analysis therefore uses a PCA that is robust to outliers, namely ROBPCA, also known as Hubert's PCA. The resulting principal components are then used as input for cluster analysis with the CLARA (Clustering Large Applications) method. CLARA is a k-medoids method that is robust to outliers and well suited to large data sets. In a case study of the variables that make up the happiness index in The World Happiness Report 2018, the CLARA method with Manhattan distance gave the best Overall Average Silhouette Width at 5 clusters.
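The clustering step can be sketched in a few lines. This is a simplified, CLARA-style illustration under stated assumptions, not the authors' code: it runs a basic k-medoids on random subsamples, keeps the medoids with the lowest total Manhattan distance over the full data, and compares cluster counts by average silhouette width. The robust ROBPCA step is omitted, the farthest-first initialization is a simplification, and the data are synthetic.

```python
import numpy as np

def manhattan(A, B):
    """Pairwise Manhattan (L1) distances between rows of A and rows of B."""
    return np.abs(A[:, None, :] - B[None, :, :]).sum(axis=2)

def k_medoids(X, k, n_iter=50):
    """Basic alternating k-medoids on a small sample; returns the medoid rows."""
    D = manhattan(X, X)
    medoids = [int(D.sum(axis=1).argmin())]          # most central point first
    while len(medoids) < k:                          # then farthest-first picks
        medoids.append(int(D[:, medoids].min(axis=1).argmax()))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # member with the smallest total distance to its own cluster
                new[j] = members[D[np.ix_(members, members)].sum(axis=1).argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return X[medoids]

def clara(X, k, n_samples=5, sample_size=40, seed=0):
    """Cluster subsamples, keep the medoids that are cheapest on the full data."""
    rng = np.random.default_rng(seed)
    best_cost, best_medoids = np.inf, None
    for _ in range(n_samples):
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        medoids = k_medoids(X[idx], k)
        cost = manhattan(X, medoids).min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    return best_medoids, manhattan(X, best_medoids).argmin(axis=1)

def average_silhouette(X, labels):
    """Mean silhouette width over all points, using Manhattan distance."""
    D = manhattan(X, X)
    widths = []
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False                              # exclude the point itself
        if not same.any():
            widths.append(0.0)                       # singleton cluster
            continue
        a = D[i, same].mean()
        b = min(D[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        widths.append((b - a) / max(a, b))
    return float(np.mean(widths))

# Synthetic, well-separated groups standing in for principal component scores.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
X = np.vstack([c + rng.normal(size=(100, 2)) for c in centers])
sil = {k: average_silhouette(X, clara(X, k)[1]) for k in range(2, 6)}
best_k = max(sil, key=sil.get)
print(sil, best_k)
```

Choosing the cluster count with the highest average silhouette width mirrors the selection criterion reported in the abstract; the subsampling is what lets CLARA scale to data sets where a full k-medoids search would be too expensive.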


Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2067
Author(s):  
Viliam Ďuriš ◽  
Renáta Bartková ◽  
Anna Tirpáková

The present contribution is devoted to the theory of fuzzy sets, especially Atanassov's intuitionistic fuzzy sets (IF sets), and their use in practice. We define the correlation between IF sets and the correlation coefficient, and we bring a new perspective to the problem of data reduction in cases where the input data come from IF sets. We present specific applications of the two best-known methods used to reduce the size of a data file, Principal Component Analysis and Factor Analysis. We examine input data from IF sets from three perspectives: through the membership function, the non-membership function and the hesitation margin. This examination better reflects the character of the input data and better captures and preserves the information that the data carry. In the article, we also present and solve a specific practical example in which we show the behavior of these methods on data from IF sets. The example is solved using the R programming language, which is useful for the statistical analysis of data and their graphical representation.
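The three perspectives named in this abstract follow from the standard definition of an IF set: each element has a membership degree μ and a non-membership degree ν with μ + ν ≤ 1, and the hesitation margin is π = 1 − μ − ν. The sketch below only splits IF-valued data into the three matrices that could then be passed to PCA or factor analysis; the function name and the example values are hypothetical, and Python is used here for illustration although the article works in R.

```python
import numpy as np

def split_if_data(mu, nu):
    """Return (membership, non-membership, hesitation) arrays for IF-valued data."""
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    if ((mu < 0) | (nu < 0) | (mu + nu > 1 + 1e-12)).any():
        raise ValueError("IF sets require mu >= 0, nu >= 0 and mu + nu <= 1")
    pi = 1.0 - mu - nu          # hesitation margin
    return mu, nu, pi

# Hypothetical IF-valued table: 3 observations, 2 variables.
mu = [[0.6, 0.3], [0.2, 0.7], [0.5, 0.5]]
nu = [[0.3, 0.5], [0.7, 0.1], [0.4, 0.2]]
membership, non_membership, hesitation = split_if_data(mu, nu)
print(hesitation)
```

Each of the three arrays is an ordinary numeric matrix, so a reduction method can be run on each perspective separately and the results compared.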


Author(s):  
Lachlan P. James ◽  
Haresh Suppiah ◽  
Michael R. McGuigan ◽  
David L. Carey

Purpose: Dozens of variables can be derived from the countermovement jump (CMJ). However, this does not guarantee an increase in useful information, because many of the variables are highly correlated. Furthermore, practitioners should seek the simplest solution to performance testing and reporting challenges. The purpose of this investigation was to show how to apply dimensionality reduction to CMJ data, with a view to offering practitioners solutions that aid application in high-performance settings. Methods: The data were collected from 3 cohorts using 3 different devices. Dimensionality reduction was undertaken on the extracted variables by way of principal component analysis and maximum likelihood factor analysis. Results: Over 90% of the variance in each CMJ data set could be explained by 3 or 4 principal components. Similarly, 2 to 3 factors could successfully explain the CMJ. Conclusions: The application of dimensionality reduction through principal component analysis and factor analysis allowed for the identification of key variables that strongly contributed to distinct aspects of jump performance. Practitioners and scientists can consider the information derived from these procedures in several ways to streamline the transfer of CMJ test information.
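The "over 90% of the variance in 3 or 4 components" result rests on a cumulative explained-variance calculation, which can be sketched as below. This is an illustration on synthetic data, not the study's CMJ measurements: twelve highly correlated variables are generated from three underlying factors, and the function names are hypothetical.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance per principal component (correlation-matrix PCA)."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    eigvals = np.clip(eigvals, 0.0, None)      # guard against tiny negative values
    return eigvals / eigvals.sum()

def n_components_for(X, threshold=0.90):
    """Smallest number of components whose cumulative share reaches threshold."""
    cum = np.cumsum(explained_variance_ratio(X))
    return min(int(np.searchsorted(cum, threshold)) + 1, len(cum))

# Synthetic stand-in: 12 correlated variables, 4 per underlying factor.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 3))
X = np.repeat(factors, 4, axis=1) + 0.1 * rng.normal(size=(200, 12))

ratios = explained_variance_ratio(X)
n_needed = n_components_for(X, 0.90)
print(np.round(np.cumsum(ratios), 3), n_needed)
```

Because many of the twelve columns carry nearly the same information, a handful of components captures almost all of the variance, which is the situation the abstract describes for jump-derived variables.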


2017 ◽  
Vol 727 ◽  
pp. 447-449 ◽  
Author(s):  
Jun Dai ◽  
Hua Yan ◽  
Jian Jian Yang ◽  
Jun Jun Guo

To evaluate the aging behavior of high density polyethylene (HDPE) under an artificial accelerated environment, principal component analysis (PCA) was used to establish a non-dimensional expression Z from a data set of multiple degradation parameters of HDPE. In this study, HDPE samples were exposed to the accelerated thermal oxidative environment for different time intervals up to 64 days. The results showed that the combined evaluating parameter Z was characterized by three-stage changes. The combined evaluating parameter Z increased quickly in the first 16 days of exposure and then leveled off. After 40 days, it began to increase again. Among the 10 degradation parameters, branching degree, carbonyl index and hydroxyl index are strongly associated. The tensile modulus is highly correlated with the impact strength. The tensile strength, tensile modulus and impact strength are negatively correlated with the crystallinity.
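The abstract does not give the formula for Z. One common way to build a single PCA-based index, assumed here purely for illustration, is to weight each retained component score by its share of the retained variance. The sketch below uses synthetic placeholder data rather than HDPE measurements, and the function name is hypothetical.

```python
import numpy as np

def composite_index(X, n_components):
    """Assumed index: variance-weighted sum of the leading principal component scores."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # non-dimensional inputs
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; make each largest-magnitude entry positive.
    top = np.abs(eigvecs).argmax(axis=0)
    eigvecs = eigvecs * np.sign(eigvecs[top, np.arange(n_components)])
    scores = Xs @ eigvecs
    weights = eigvals / eigvals.sum()                     # share of retained variance
    return scores @ weights

# Synthetic placeholder: 4 parameters drifting upward over 9 exposure times.
rng = np.random.default_rng(0)
days = np.arange(0, 72, 8, dtype=float)                   # 0, 8, ..., 64
slopes = np.array([0.5, 1.0, 1.5, 2.0])
X = days[:, None] * slopes + rng.normal(size=(9, 4))

z = composite_index(X, n_components=2)
print(np.round(z, 2))
```

Because the inputs are standardized first, the resulting index is unit-free, which matches the "non-dimensional" description; the sign convention matters, since flipping an eigenvector would reverse the direction of the index.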

