Visualizing the Complexity of the Athlete-Monitoring Cycle Through Principal-Component Analysis

Purpose: To discuss the use of principal-component analysis (PCA) as a dimension-reduction and visualization tool to assist in decision making and communication when analyzing complex multivariate data sets associated with the training of athletes. Conclusions: Using PCA, it is possible to transform a data matrix into a set of orthogonal composite variables called principal components (PCs), with each PC being a linear weighted combination of the observed variables and with all PCs uncorrelated to each other. The benefit of transforming the data using PCA is that the first few PCs generally capture the majority of the information (ie, variance) contained in the observed data, with the first PC accounting for the highest amount of variance and each subsequent PC capturing less of the total information. Consequently, through PCA, it is possible to visualize complex data sets containing multiple variables on simple 2D scatterplots without any great loss of information, thereby making it much easier to convey complex information to coaches. In the future, athlete-monitoring companies should integrate PCA into their client packages to better support practitioners trying to overcome the challenges associated with multivariate data analysis and interpretation. In the interim, the authors present here an overview of PCA and associated R code to assist practitioners working in the field to integrate PCA into their athlete-monitoring process.
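
The transformation described in this abstract can be illustrated with a minimal sketch. The abstract refers to R code; the version below uses Python with NumPy instead, and the data matrix is synthetic (60 hypothetical sessions by 6 hypothetical monitoring variables), so it is an illustration under those assumptions and not the authors' implementation.

```python
import numpy as np

def pca(data):
    """Return scores, loadings, and explained-variance ratios for a samples-by-variables matrix."""
    # Standardize each column so variables on different scales contribute equally.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Eigendecomposition of the correlation matrix of the observed variables.
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    # eigh returns ascending eigenvalues; reorder so PC1 has the largest variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs  # each PC is a weighted linear combination of the variables
    return scores, eigvecs, eigvals / eigvals.sum()

# Synthetic stand-in for monitoring data (assumption, not from the paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
data = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(60, 6))

scores, loadings, ratio = pca(data)
pc1_pc2 = scores[:, :2]  # coordinates for a simple 2D scatterplot
print("explained variance ratio:", np.round(ratio, 3))
```

Plotting `pc1_pc2` as a scatterplot gives the kind of 2D view the abstract describes; how much information it retains depends on how much of the variance the first two ratios cover.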

Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure

Ekológia (Bratislava) ◽ 10.1515/eko-2016-0014 ◽ 2016 ◽ Vol 35 (2) ◽ pp. 173-190 ◽ Cited By ~ 13

Author(s): S. Shahid Shaukat ◽ Toqeer Ahmed Rao ◽ Moazzam A. Khan

Keyword(s): Principal Component Analysis ◽ Sample Size ◽ Principal Component ◽ Component Analysis ◽ Small Sample ◽ Environmental Data ◽ Data Matrix ◽ Data Sets ◽ Data Set ◽ The Impact

Abstract: In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix of water quality variables (p = 22) from a small data set comprising 55 samples (stations from which water samples were collected). Because data sets in ecology and environmental sciences are invariably small owing to the high cost of collecting and analyzing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared by agglomerative cluster analysis using Ward’s method, which does not require any stringent distributional assumptions.
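
A rough sketch of the resampling step may help make the design concrete. It assumes the bootstrap samples are drawn by sampling rows with replacement and that PCA is run on the correlation matrix; neither detail is stated in the abstract. The data are a synthetic placeholder with the quoted dimensions (55 samples, 22 variables), and the function name `bootstrap_eigenvalues` is hypothetical. The Ward clustering comparison is not reproduced here.

```python
import numpy as np

def bootstrap_eigenvalues(data, n, n_boot=100, n_keep=10, seed=0):
    """Leading correlation-matrix eigenvalues for n_boot resamples of n rows each."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_boot, n_keep))
    for b in range(n_boot):
        rows = rng.integers(0, data.shape[0], size=n)  # sample rows with replacement
        corr = np.corrcoef(data[rows], rowvar=False)
        eigvals = np.linalg.eigvalsh(corr)[::-1]  # reverse to descending order
        out[b] = eigvals[:n_keep]
    return out

# Synthetic placeholder with the dimensions quoted in the abstract (assumption).
rng = np.random.default_rng(1)
data = rng.normal(size=(55, 3)) @ rng.normal(size=(3, 22)) + rng.normal(size=(55, 22))

for n in (20, 30, 40, 50):
    ev = bootstrap_eigenvalues(data, n)
    # Spread of the first eigenvalue across resamples at this sample size.
    print(n, "mean lambda1 =", round(ev[:, 0].mean(), 3), "sd =", round(ev[:, 0].std(ddof=1), 3))
```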

Principal component analysis of substituent constants

Collection of Czechoslovak Chemical Communications ◽ 10.1135/cccc19900055 ◽ 1990 ◽ Vol 55 (1) ◽ pp. 55-62 ◽ Cited By ~ 1

Author(s): Drahomír Hnyk

Keyword(s): Principal Component Analysis ◽ Inductive Effect ◽ Resonance Effect ◽ Principal Component ◽ Component Analysis ◽ Total Variance ◽ Data Matrix ◽ Cumulative Proportion ◽ Two Factors ◽ Substituent Constants

The principal component analysis has been applied to a data matrix formed by 7 usual substituent constants for 38 substituents. Three factors are able to explain 99.4% cumulative proportion of total variance. Several rotations have been carried out for the first two factors in order to obtain their physical meaning. The first factor is related to the resonance effect, whereas the second one expresses the inductive effect, and both together describe 97.5% cumulative proportion of total variance. Their mutual orthogonality does not directly follow from the rotations carried out. With the help of these factors the substituents are divided into four main classes, and some of them assume a special position.

Principal Component Analysis of Hydrological Data

Handbook of Research on Hydroinformatics ◽ 10.4018/978-1-61520-907-1.ch018 ◽ 2010 ◽ pp. 364-388

Author(s): Petr Praus

Keyword(s): Water Quality ◽ Principal Component Analysis ◽ Drinking Water ◽ Ground Water ◽ Principal Component ◽ Component Analysis ◽ Data Sets ◽ Hydrological Data ◽ First Case

In this chapter, the principles and applications of principal component analysis (PCA) applied to hydrological data are presented. Four case studies showed that PCA can yield information about a wastewater treatment process and drinking water quality in a city network, and can find similarities in data sets of ground water quality results and water-related images. In the first case study, the composition of raw and cleaned wastewater was characterised and its temporal changes were displayed. In the second case study, drinking water samples were divided into clusters consistent with their sampling localities. In case study III, similar ground water samples were identified by calculating cosine similarity and the Euclidean and Manhattan distances. In case study IV, 32 water-related images were transformed into a large image matrix whose dimensionality was reduced by PCA. The images were clustered using the PCA scatter plots.
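
The three similarity measures named for case study III can be sketched as follows. The vectors `x` and `y` are hypothetical measurement vectors chosen for easy arithmetic; they are not data from the chapter, and the chapter may apply the measures to PCA scores or to preprocessed data in ways not shown here.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean(a, b):
    """Straight-line (L2) distance between two vectors."""
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.abs(a - b).sum())

# Hypothetical samples: y is exactly twice x, so they point in the same direction.
x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 4.0, 4.0])

print(cosine_similarity(x, y))  # 1.0 (parallel vectors)
print(euclidean(x, y))          # 3.0
print(manhattan(x, y))          # 5.0
```

The example shows why the measures can disagree: cosine similarity treats the two samples as identical in composition, while both distances register the difference in magnitude.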

A Study of Effectiveness of Principal Component Analysis on Different Data Sets

2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) ◽ 10.1109/iccic.2017.8524329 ◽ 2017

Author(s): Mukti Krishnan ◽ Dipankar Dutta

Keyword(s): Principal Component Analysis ◽ Principal Component ◽ Component Analysis ◽ Data Sets

SVD-based principal component analysis of geochemical data

Open Chemistry ◽ 10.2478/bf02475200 ◽ 2005 ◽ Vol 3 (4) ◽ pp. 731-741 ◽ Cited By ~ 11

Author(s): Petr Praus

Keyword(s): Principal Component Analysis ◽ Euclidean Distance ◽ Principal Component ◽ Component Analysis ◽ Data Matrix ◽ Geochemical Data ◽ Average Group ◽ Testing Data ◽ Value Decomposition ◽ Geochemical Variables

Abstract: Principal Component Analysis (PCA) was used for the mapping of geochemical data. A testing data matrix was prepared from the chemical and physical analyses of coals altered by thermal and oxidation effects. PCA based on Singular Value Decomposition (SVD) of the standardized (centered and scaled by the standard deviation) data matrix revealed three principal components explaining 85.2% of the variance. By combining the scatter and component weights plots with knowledge of the composition of the tested samples, the coal samples were divided into seven groups depending on the degree of their oxidation and thermal alteration. The PCA findings were verified by other multivariate methods. The relationships among geochemical variables were successfully confirmed by Factor Analysis (FA). The data structure was also described by the Average Group dendrogram using Euclidean distance. The sample clusters found this way were not as clearly defined as in the case of PCA, which can be explained by PCA filtering out the data noise.
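
The SVD route to PCA mentioned in this abstract can be sketched as below. This is a generic illustration assuming the standardization described (centering and scaling by the sample standard deviation); the function name `pca_svd` and the synthetic 30-by-8 matrix are assumptions, not the paper's data or code.

```python
import numpy as np

def pca_svd(data):
    """PCA of a samples-by-variables matrix via SVD of the standardized data."""
    # Center each column and scale it by its sample standard deviation.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Thin SVD: z = u @ diag(s) @ vt, with singular values in descending order.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u * s                   # sample coordinates on the principal components
    weights = vt.T                   # component weights (one column per component)
    explained = s**2 / np.sum(s**2)  # share of total variance per component
    return scores, weights, explained

# Synthetic placeholder data (assumption).
rng = np.random.default_rng(2)
data = rng.normal(size=(30, 3)) @ rng.normal(size=(3, 8)) + 0.5 * rng.normal(size=(30, 8))

scores, weights, explained = pca_svd(data)
print("cumulative variance of first three PCs:", round(float(explained[:3].sum()), 3))
```

The squared singular values divided by n - 1 equal the eigenvalues of the correlation matrix, so this gives the same components as an eigendecomposition without forming that matrix explicitly.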

ClustVis: a web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap

Nucleic Acids Research ◽ 10.1093/nar/gkv468 ◽ 2015 ◽ Vol 43 (W1) ◽ pp. W566-W570 ◽ Cited By ~ 924

Author(s): Tauno Metsalu ◽ Jaak Vilo

Keyword(s): Principal Component Analysis ◽ Multivariate Data ◽ Principal Component ◽ Component Analysis ◽ Web Tool

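Entwicklung, Beobachtung und Steuerung integrierter, quasi-kontinuierlicher pharmazeutischer Produktionsprozesse mit Methoden der Multivariaten Datenverarbeitung [Development, Monitoring, and Control of Integrated, Quasi-Continuous Pharmaceutical Production Processes Using Methods of Multivariate Data Processing]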

10.51202/9783186292179 ◽ 2016

Author(s): Sven-Oliver Borchert

Keyword(s): Principal Component Analysis ◽ Data Analysis ◽ Multivariate Data Analysis ◽ Process Analytical Technology ◽ Multivariate Data ◽ Principal Component ◽ Component Analysis ◽ Analytical Technology

This work addresses aspects of modern bioprocess engineering, using processes for the production of recombinant potential malaria vaccines as an example. Two quasi-continuous processes were built from conventional batch unit operations, with the application of Process Analytical Technology in the foreground. The main focus of this work was the implementation of Multivariate Data Analysis (MVDA) for monitoring and evaluating the cyclic process sequence and its reproducibility. In the area of Principal Component Analysis, the process-monitoring method based on the Golden Batch tunnel was applied. With the Golden Batch approach, methods for process prediction were implemented and, using Model Predictive Multivariate Control, also tested for the control of real processes. In addition, MVDA was used to predict media components and their cell-specific reaction rates from classical onli...

EVALUATION AND MODELLING OF GROUND WATER QUALITY DATA OF ALLAHABAD CITY BY ENVIRONMETRIC METHODS

Green Chemistry & Technology Letters ◽ 10.18510/gctl.2016.248 ◽ 2016 ◽ Vol 2 (4) ◽ pp. 211

Author(s): Girdhari Lal Chaurasia ◽ Mahesh Kumar Gupta ◽ Praveen Kumar Tandon

Keyword(s): Water Quality ◽ Principal Component Analysis ◽ Cluster Analysis ◽ Factor Analysis ◽ Principal Component ◽ Component Analysis ◽ Sampling Location ◽ Data Sets ◽ Multivariate Statistical ◽ Positive Loading

Water is an essential resource for all organisms, including plants, animals, and human beings. It is the backbone of the agricultural and industrial sectors and of all small business units. Growth in human population and economic activity has tremendously increased the demand for large-scale suppliers of fresh water for various competing end users. The quality of water is evaluated in terms of physical, chemical, and biological parameters. A particular problem in water quality monitoring is the complexity of analyzing the large number of measured variables. The data sets contain rich information about the behavior of the water resources. Multivariate statistical approaches make it possible to derive hidden information from the data sets about the possible influences of the environment on water quality. Classification, modeling, and interpretation of monitored data are the most important steps in the assessment of water quality. Multivariate statistical techniques such as cluster analysis (CA), principal component analysis (PCA), and factor analysis (FA) help to identify the important components or factors accounting for most of the variance of a system. In the present study, water samples were subjected to various physicochemical analyses following the standards of APHA, BIS, and WHO, and then to further statistical analysis, namely cluster analysis, to understand the similarities and differences among the sampling stations. Three clusters were found: cluster 1 comprised sampling locations 1, 3, and 5; cluster 2 comprised sampling location 2; and cluster 3 comprised sampling location 4. Principal component analysis/factor analysis is a pattern recognition technique used to assess the correlation between observations in terms of underlying factors that are not directly observable. Observations that are correlated, either positively or negatively, are likely to be affected by the same factors, while observations that are not correlated are influenced by different factors. In our study, three factors explained 99.827% of the variance. Factor 1 accounted for 51.619% of the total variance, with strong positive loadings for TSS, TS, temperature, TDS, and phosphate and a moderate loading for electrical conductivity (loading values 0.986, 0.970, 0.792, 0.744, 0.695, and 0.701, respectively). Factor 2 accounted for 27.236% of the total variance, with moderate positive loadings for total alkalinity and temperature (0.723 and 0.606, respectively) and moderate negative loadings for conductivity, TDS, and chloride (-0.698, -0.690, and -0.582, respectively). Factor 3 accounted for 20.972% of the variance, with a strong positive loading for pH (0.872) and moderate positive loadings for chloride and phosphate (0.721 and 0.569, respectively).

Identification of Excitation Source Number Using Principal Component Analysis

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.199-200.850 ◽ 2011 ◽ Vol 199-200 ◽ pp. 850-857

Author(s): Jian Chao Dong ◽ Tie Jun Yang ◽ Xin Hui Li ◽ Zhi Jun Shuai ◽ You Hong Xiao

Keyword(s): Principal Component Analysis ◽ Signal To Noise Ratio ◽ Multiple Input Multiple Output ◽ Principal Component ◽ Relevant Information ◽ Component Analysis ◽ Data Sets ◽ Blind Signal ◽ Input Multiple Output ◽ Source Number

Principal component analysis (PCA), one of the basic blind signal processing techniques, is extensively employed in all forms of analysis to extract relevant information from confusing data sets. This paper first explains the principle of PCA, then describes a simulation and an experiment carried out on a simply supported beam rig, in which PCA is used in the frequency domain to identify the number of sources in several cases. The contribution coefficient of the principal components (PCs) and the signal-to-noise ratio between neighboring PCs (neighboring SNR) are introduced to cut off minor components quantitatively. The results show that when the number of observations is equal to or larger than the number of sources and the additive noise is weak, PCA can accurately predict the number of uncorrelated excitation sources in a multiple input multiple output system.

Functional Connectivity: The Principal-Component Analysis of Large (PET) Data Sets

Journal of Cerebral Blood Flow & Metabolism ◽ 10.1038/jcbfm.1993.4 ◽ 1993 ◽ Vol 13 (1) ◽ pp. 5-14 ◽ Cited By ~ 1182

Author(s): K. J. Friston ◽ C. D. Frith ◽ P. F. Liddle ◽ R. S. J. Frackowiak

Keyword(s): Principal Component Analysis ◽ Functional Connectivity ◽ Verbal Fluency ◽ Principal Component ◽ Temporal Correlation ◽ Large Data ◽ Component Analysis ◽ Anterior Cingulate ◽ Data Sets ◽ Brain System

The distributed brain systems associated with performance of a verbal fluency task were identified in a nondirected correlational analysis of neurophysiological data obtained with positron tomography. This analysis used a recursive principal-component analysis developed specifically for large data sets. This analysis is interpreted in terms of functional connectivity, defined as the temporal correlation of a neurophysiological index measured in different brain areas. The results suggest that the variance in neurophysiological measurements, introduced experimentally, was accounted for by two independent principal components. The first, and considerably larger, highlighted an intentional brain system seen in previous studies of verbal fluency. The second identified a distributed brain system including the anterior cingulate and Wernicke's area that reflected monotonic time effects. We propose that this system has an attentional bias.
