PCAGO: An interactive web service to analyze RNA-Seq data with principal component analysis

Mapping Intimacies ◽

10.1101/433078 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ruman Gerst ◽

Martin Hölzer

Keyword(s):

Principal Component Analysis ◽

Web Service ◽

Principal Components ◽

Clustering Algorithm ◽

Gene Annotation ◽

Principal Component ◽

Component Analysis ◽

Rna Seq ◽

Gene Sets ◽

Relationship Of

Abstract: The initial characterization and clustering of biological samples is a critical step in the analysis of any transcriptomic study. In many studies, principal component analysis (PCA) is the clustering algorithm of choice to predict the relationship of samples or cells based solely on differential gene expression. In addition to the pure quality evaluation of the data, a PCA can also provide initial insights into the biological background of an experiment and help researchers to interpret the data and design the subsequent computational steps accordingly. However, to avoid misleading clusterings and interpretations, an appropriate selection of the underlying gene sets to build the PCA and the choice of the most fitting principal components for the visualization are crucial parts. Here, we present PCAGO, an easy-to-use and interactive web service to analyze gene quantification data derived from RNA sequencing (RNA-Seq) experiments with PCA. The tool includes features such as read-count normalization, filtering of read counts by gene annotation, and various visualization options. Additionally, PCAGO helps to select appropriate parameters such as the number of genes and principal components to create meaningful visualizations. Availability and implementation: The web service is implemented in R and freely available at [email protected]
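The gene-selection step described in this abstract can be illustrated with a minimal sketch. This is not PCAGO's implementation (the service itself is written in R); the choice of Python with NumPy, the log-CPM-style normalization, the helper name `pca_top_variance_genes`, and the synthetic counts are all assumptions made for illustration.

```python
import numpy as np

def pca_top_variance_genes(counts, n_genes, n_components=2):
    """PCA on the n_genes most variable genes of a (genes x samples) count matrix.

    Hypothetical helper for illustration; not part of PCAGO.
    """
    counts = np.asarray(counts, dtype=float)
    # Library-size normalization to counts per million, then log2 with a pseudocount.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    logged = np.log2(cpm + 1.0)
    # Keep the genes with the highest variance across samples.
    order = np.argsort(logged.var(axis=1))[::-1][:n_genes]
    selected = logged[order]
    # Samples become rows; center each gene before the SVD.
    x = selected.T - selected.T.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
toy_counts = rng.poisson(lam=50, size=(200, 6))  # 200 genes, 6 samples (synthetic)
toy_counts[:20, 3:] *= 8                         # 20 genes are higher in samples 4-6
scores, explained = pca_top_variance_genes(toy_counts, n_genes=50)
```

Changing `n_genes` and inspecting `explained` shows how the number of selected genes affects the share of variance carried by the plotted components, which is the kind of parameter choice the abstract says the tool supports.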

Stormwater inflow prediction using radar rainfall data compressed by principal component analysis

Water Practice & Technology ◽

10.2166/wpt.2006.017 ◽

2006 ◽

Vol 1 (1) ◽

Author(s):

K. Katayama ◽

K. Kimijima ◽

O. Yamanaka ◽

A. Nagaiwa ◽

Y. Ono

Keyword(s):

Principal Component Analysis ◽

Prediction Model ◽

Principal Components ◽

Prediction Method ◽

Principal Component ◽

Component Analysis ◽

Rainfall Data ◽

Radar Rainfall ◽

Input Variables ◽

Inflow Prediction

This paper proposes a method of stormwater inflow prediction that uses radar rainfall data as the input to a prediction model constructed by system identification. The aim is to construct a compact system by reducing the dimension of the input data. Principal Component Analysis (PCA), which is widely used as a statistical method for data analysis and compression, is applied to pre-process the radar rainfall data. The proposed method is then evaluated using radar rainfall data and inflow data acquired in a combined sewer system. The study shows that a few principal components of the radar rainfall data are appropriate input variables for the stormwater inflow prediction model. Consequently, a procedure has been established for stormwater inflow prediction using a few principal components of radar rainfall data.
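The idea of compressing many radar grid cells into a few principal components before fitting a prediction model can be sketched as follows. The paper's model is built by system identification; the static least-squares fit below is a simplified stand-in, and the synthetic data, grid size, and function names `fit_pca` and `compress` are assumptions, not the authors' setup.

```python
import numpy as np

def fit_pca(x, n_components):
    """Return the column means and the leading principal axes of x (samples x features)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def compress(x, mean, axes):
    """Project x onto the retained principal axes."""
    return (x - mean) @ axes.T

rng = np.random.default_rng(1)
n_steps, n_cells = 300, 100             # synthetic: 300 time steps, 100 radar grid cells
latent = rng.normal(size=(n_steps, 3))  # three hidden rainfall patterns (assumption)
mixing = rng.normal(size=(3, n_cells))
rain = latent @ mixing + 0.05 * rng.normal(size=(n_steps, n_cells))
inflow = latent @ np.array([2.0, -1.0, 0.5]) + 0.05 * rng.normal(size=n_steps)

mean, axes = fit_pca(rain, n_components=3)
z = compress(rain, mean, axes)                   # 100 inputs reduced to 3
design = np.column_stack([z, np.ones(n_steps)])  # add an intercept column
coef, *_ = np.linalg.lstsq(design, inflow, rcond=None)
predicted = design @ coef
r2 = 1.0 - np.sum((inflow - predicted) ** 2) / np.sum((inflow - inflow.mean()) ** 2)
```

Because the synthetic rainfall field is generated from three patterns, three components are enough here; with real radar data the number of components to retain would have to be chosen from the data.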

The Design of Index about Corporate Governance Based on PCA Method

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.926-930.4085 ◽

2014 ◽

Vol 926-930 ◽

pp. 4085-4088

Author(s):

Chuan Jun Li

Keyword(s):

Principal Component Analysis ◽

Corporate Governance ◽

Principal Components ◽

Principal Component ◽

Component Analysis ◽

Contribution Rate ◽

Variance Contribution ◽

Pca Method

This article uses principal component analysis (PCA) to evaluate the level of corporate governance. PCA is used to analyze the correlation among 10 original indicators and to extract principal components that retain most of the information in those indicators. The corporate governance index is then obtained by weighting the principal components according to their variance contribution rates, which allows corporate governance to be evaluated comprehensively.
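The abstract does not give the index formula, so the following is only a sketch of one common construction: each retained component score is weighted by its variance contribution rate, renormalized over the retained components. The eigenvalues, the component scores, and the function names are hypothetical.

```python
def contribution_rates(eigenvalues):
    """Variance contribution rate of each component: eigenvalue / sum of all eigenvalues."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]

def composite_index(scores, eigenvalues, n_keep):
    """Weight the first n_keep component scores by their normalized contribution rates."""
    rates = contribution_rates(eigenvalues)[:n_keep]
    weights = [r / sum(rates) for r in rates]
    return sum(w * s for w, s in zip(weights, scores[:n_keep]))

# Hypothetical eigenvalues of a 10-indicator correlation matrix (they sum to 10).
eigenvalues = [4.0, 2.5, 1.5, 0.8, 0.5, 0.3, 0.2, 0.1, 0.06, 0.04]
rates = contribution_rates(eigenvalues)
cumulative = sum(rates[:3])      # share of variance kept by three components
firm_scores = [1.2, -0.4, 0.3]   # hypothetical component scores for one firm
index = composite_index(firm_scores, eigenvalues, n_keep=3)
```

With these made-up numbers the three retained components keep 80% of the variance, and the weights are 0.5, 0.3125, and 0.1875.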

Identification of Liquor Brands Based on near Infrared Spectroscopy

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.834-836.935 ◽

2013 ◽

Vol 834-836 ◽

pp. 935-938

Author(s):

Lian Shun Zhang ◽

Chao Guo ◽

Bao Quan Wang

Keyword(s):

Principal Component Analysis ◽

Infrared Spectroscopy ◽

Near Infrared Spectroscopy ◽

Principal Components ◽

Near Infrared ◽

Scatter Correction ◽

Principal Component ◽

Component Analysis ◽

Correction Method ◽

Variance Contribution

In this paper, liquor brands were identified using near infrared spectroscopy and principal component analysis. Sixty samples from 6 different liquor brands were measured with a USB4000 spectrometer. To eliminate noise caused by external factors, smoothing and multiplicative scatter correction were applied, yielding corrected spectra for the 60 samples. The spectral shapes of the different brands did not differ enough to classify them directly, so principal component analysis was applied for further analysis. The results showed that the cumulative variance contribution rate of the first two principal components reached 99.06%, so these two components effectively represent the information in the preprocessed spectra. In the scatter plot of the first two principal components, the 6 brands were identified more accurately and more easily than from the spectral curves.
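The two processing steps named in this abstract, multiplicative scatter correction and the cumulative variance contribution of the leading components, can be sketched as below. The standard formulation of scatter correction (regress each spectrum on the mean spectrum, then remove the fitted offset and slope) is assumed; the paper's exact settings are not stated, and the synthetic spectra and function names are illustrative only.

```python
import numpy as np

def multiplicative_scatter_correction(spectra):
    """Regress each spectrum on the mean spectrum and remove the fitted offset and slope."""
    spectra = np.asarray(spectra, dtype=float)
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected

def cumulative_contribution(x, n_components):
    """Fraction of total variance captured by the first n_components principal components."""
    centered = x - x.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return float(np.sum(s[:n_components] ** 2) / np.sum(s ** 2))

# Synthetic spectra: one base curve with sample-specific offset and scale (assumption).
wavelengths = np.linspace(0.0, 1.0, 50)
base = np.exp(-((wavelengths - 0.5) ** 2) / 0.02)
raw = np.array([a + b * base for a, b in [(0.1, 0.8), (0.3, 1.0), (0.0, 1.2), (0.2, 1.5)]])
corrected = multiplicative_scatter_correction(raw)
share = cumulative_contribution(raw, n_components=2)
```

In this toy case the spectra differ only by offset and scale, so the correction maps them all onto the mean spectrum. Real spectra also carry brand-specific differences, which is what the subsequent PCA is meant to expose.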

Grouping sunflower genotypes for yield, oil content, and reaction to Alternaria leaf spot using GGE biplot

Pesquisa Agropecuária Brasileira ◽

10.1590/s0100-204x2015000800003 ◽

2015 ◽

Vol 50 (8) ◽

pp. 649-657 ◽

Cited By ~ 2

Author(s):

Regina Maria Villas Bôas de Campos Leite ◽

Maria Cristina Neves de Oliveira

Keyword(s):

Principal Component Analysis ◽

Disease Severity ◽

Principal Components ◽

Leaf Spot ◽

Oil Content ◽

Principal Component ◽

Component Analysis ◽

High Yield ◽

Gge Biplot ◽

Alternaria Leaf Spot

Abstract: The objective of this work was to evaluate the suitability of the multivariate method of principal component analysis (PCA) using the GGE biplot software for grouping sunflower genotypes for their reaction to Alternaria leaf spot disease (Alternariaster helianthi), and for their yield and oil content. Sixty-nine genotypes were evaluated for disease severity in the field, at the R3 growth stage, in seven growing seasons, in Londrina, in the state of Paraná, Brazil, using a diagrammatic scale developed for this disease. Yield and oil content were also evaluated. Data were standardized using the software Statistica, and GGE biplot was used for PCA and graphical display of data. The first two principal components explained 77.9% of the total variation. According to the polygonal biplot using the first two principal components and three response variables, the genotypes were divided into seven sectors. Genotypes located on sectors 1 and 2 showed high yield and high oil content, respectively, and those located on sector 7 showed tolerance to the disease and high yield, despite the high disease severity. The principal component analysis using GGE biplot is an efficient method for grouping sunflower genotypes based on the studied variables.

Principal component analysis for evaluating a ranking method used in the performance testing in sheep of Morada Nova breed

Semina Ciências Agrárias ◽

10.5433/1679-0359.2015v36n6p3909 ◽

2015 ◽

Vol 36 (6) ◽

pp. 3909

Author(s):

Michelle Santos da Silva ◽

Luciana Shiotsuki ◽

Raimundo Nonato Braga Lôbo ◽

Olivardo Facó

Keyword(s):

Principal Component Analysis ◽

Principal Components ◽

Correlation Coefficients ◽

Performance Testing ◽

Principal Component ◽

Component Analysis ◽

Ranking Method ◽

Daily Weight Gain ◽

Meat Production ◽

Body Depth

A multivariate approach was adopted to evaluate the relationship among traits measured in the performance testing of Morada Nova sheep, verify the efficiency of a ranking method used in these tests and identify the most significant traits for use in future analyses. Data from 150 young rams participating in five versions of the performance tests for the Morada Nova breed were used. Twenty traits were measured in each animal: initial weight (IW), final weight (FW), average daily weight gain (ADG), loin eye area (LEA), scrotal circumference (SC), fat thickness (FT), conformation (C), precocity (Pc), muscularity (M), breed features (BF), legs (L), withers height (WH), chest width (CW), rump height (RH), rump width (RW), rump length (RL), body length (BL), body depth (BD), heart girth (HG) and body condition scoring (BCS). The Pearson’s correlation coefficients ranged from –0.10 to 0.93, with the highest correlations between body weight variables and morphometric measurements. The first three principal components explained 72.28% of the total variability among all traits. The variables related to animal size defined the first principal component, whereas those related to visual appraisal and suitability for meat production defined the second and third principal components, respectively. The combination of traits from the principal component analysis showed that the ranking method currently used in the performance testing of Morada Nova sheep is efficient for selecting larger rams with better breed features and higher degrees of specialization for meat production.

Measuring the Systematic Risk of Sectors within the US Market Via Principal Components Analysis: Before and during the COVID-19 Pandemic

10.5772/intechopen.101860 ◽

2022 ◽

Author(s):

Jaime González Maiz Jiménez ◽

Adán Reyes Santiago

Keyword(s):

Principal Component Analysis ◽

Stock Market ◽

Principal Components Analysis ◽

Principal Components ◽

Systematic Risk ◽

Principal Component ◽

Component Analysis ◽

Market Capitalization ◽

The Us ◽

Components Analysis

This research measures the systematic risk of 10 sectors in the American stock market, distinguishing the periods before and during the COVID-19 pandemic. The novelty of this study is the use of the Principal Component Analysis (PCA) technique to measure the systematic risk of each sector, selecting the five stocks per sector with the greatest market capitalization. The results show that the sectors with the greatest increase in exposure to systematic risk during the pandemic are restaurants, clothing, and insurance, whereas the sectors with the greatest decrease in exposure to systematic risk are automakers and tobacco. Based on these results, it seems advisable for practitioners to select stocks from either the automakers or the tobacco sector for protection against health crises such as COVID-19.

Competitive and Recreational Running Kinematics Examined Using Principal Components Analysis

Healthcare ◽

10.3390/healthcare9101321 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1321

Author(s):

Wenjing Quan ◽

Huiyu Zhou ◽

Datao Xu ◽

Shudong Li ◽

Julien S. Baker ◽

...

Keyword(s):

Principal Component Analysis ◽

Principal Components ◽

Sagittal Plane ◽

Three Dimensional ◽

Frontal Plane ◽

Principal Component ◽

Component Analysis ◽

Ankle Inversion ◽

Motion System ◽

Recreational Runners

Kinematics data are primary biomechanical parameters. A principal component analysis (PCA) of waveforms is a statistical approach used to explore patterns of variability in biomechanical curve datasets. Differences between experienced and recreational runners’ kinematic variables are still unclear. The purpose of the present study was to compare kinematics parameters of competitive and recreational runners using principal component analysis in the sagittal, frontal and transverse planes. Forty male runners were divided into two groups: twenty competitive runners and twenty recreational runners. A Vicon Motion System (Vicon Metrics Ltd., Oxford, UK) captured three-dimensional kinematics data during running at 3.3 m/s. Principal component analysis was used to determine the dominant variation in this model. The first three principal components were retained, and their scores were analyzed using independent t-tests. The recreational runners were found to have a smaller dorsiflexion angle, initial dorsiflexion contact angle, ankle inversion, knee adduction, and range of motion in the frontal plane of the knee and hip. The running kinematics data were influenced by running experience. The findings provide a better understanding of the kinematics variables of competitive and recreational runners and might therefore have implications for reducing running injury and improving running performance.

Multiple Process Capability Improvement Method based on Principal Component Analysis

Mechanical Engineering Science ◽

10.33142/me.v1i1.654 ◽

2019 ◽

Vol 1 (1) ◽

Author(s):

Guangqi Ying ◽

Yan Ran ◽

Genbao Zhang ◽

Yuxin Liu ◽

Shengyong Zhang

Keyword(s):

Principal Component Analysis ◽

Principal Components ◽

Evaluation Method ◽

Process Capability ◽

Principal Component ◽

Component Analysis ◽

Multiple Process ◽

Improvement Method ◽

Capability Evaluation ◽

Capability Improvement

The traditional method of constructing a multi-process capability index based on principal component analysis mainly considers the process variables but not the process capability, which biases the contribution rate of the principal components. To address this, the paper first clarifies the problem through theoretical analysis and a worked example. Second, aiming at the rationality of the principal components degree, it proposes an evaluation method for pre-processing data before constructing the MPCI using PCA. The pre-processing mainly standardizes the specification interval of the quality characteristics, which makes the principal components degree more reasonable and optimizes the process capability evaluation method. Finally, the effectiveness and feasibility of the method are demonstrated by an application example.

Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA-seq Data

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-94968-0_32 ◽

2018 ◽

pp. 335-346

Author(s):

Krzysztof Gogolewski ◽

Maciej Sykulski ◽

Neo Christopher Chung ◽

Anna Gambin

Keyword(s):

Principal Component Analysis ◽

Noise Reduction ◽

Single Cell ◽

Principal Component ◽

Component Analysis ◽

Rna Seq ◽

Robust Principal Component Analysis

Study of Body Conformation of Carpet Wool Type Chitarangi Sheep of India using Principal Component Analysis

Indian Journal of Animal Research ◽

10.18805/ijar.b-4285 ◽

2021 ◽

Author(s):

A.K. Mishra ◽

Anand Jain ◽

S. Singh ◽

R.K. Pundir

Keyword(s):

Principal Component Analysis ◽

Body Weight ◽

Total Variation ◽

Principal Components ◽

Principal Component ◽

Component Analysis ◽

Sheep Population ◽

Body Conformation ◽

Face Length ◽

Minimum Number

Background: Principal component analysis is applied to identify the minimum number of combined variables that account for the maximum portion of the variance existing in all variables studied. Chitarangi is a lesser known carpet type wool sheep distributed in Fazilka and Muktsar districts of Punjab, Sri Ganganagar district of Rajasthan and the adjoining areas. Information on body biometry is a prerequisite for characterizing the lesser known sheep populations available in the country. Hence, it is important to describe the body conformation by recording a minimum number of biometric traits. Methods: Body biometry traits of Chitarangi sheep, a lesser known carpet quality wool producing sheep population, were studied using Principal Component Analysis. The traits studied were body length (BL), height at wither (HW), chest girth (CG), paunch girth (PG), ear length (EL), face length (FL), face width (FW), tail length (TL) and adult body weight (BW). The data were collected on 297 ewes in the breeding tract of Chitarangi sheep. Descriptive statistics were determined for all the traits. The phenotypic correlations between different body biometric traits were estimated using partial correlations. Principal components were estimated using the correlation matrix. Principal component analysis (PCA), a multivariate approach, is used when the recorded traits are highly correlated. Rotation of principal components was through the transformation of the components to approximate a simple structure. Factor analysis using oblique (promax) rotation was used. All analyses were carried out using the SPSS statistical package. Result: The averages for body weight and biometry traits confirmed the large size of Chitarangi animals. Most of the phenotypic correlations amongst the studied traits were positive and significant (p < 0.01). The three components extracted from nine principal components accounted for 69.06% of the total variance. The first component, which described body size of ewes, accounted for 43.68% of the total variation with high loadings for BW, CG, PG, HW, BL and FL. Components two and three explained 13.54 and 11.83% of the total variance, respectively. The communalities ranged from 0.490 (FL) to 0.888 (PG). The lower communality for face length indicated a lower contribution of this trait to explaining the total variation than the others. The study indicates that principal components provided a means of reducing the number of biometric traits needed to explain the body conformation of adult female Chitarangi sheep.
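The variance shares reported in this abstract can be checked with a few lines of arithmetic. The percentages are taken from the abstract. The implied eigenvalues rest on the assumption that, because the components were estimated from the correlation matrix of nine traits, the total variance equals 9.

```python
reported_shares = {"PC1": 43.68, "PC2": 13.54, "PC3": 11.83}  # percentages from the abstract
reported_total = 69.06
n_traits = 9  # nine standardized traits, so the eigenvalues sum to 9 (assumption)

summed = round(sum(reported_shares.values()), 2)
gap = round(reported_total - summed, 2)

# Eigenvalue implied by each share: share / 100 * number of traits.
implied_eigenvalues = {
    name: round(share / 100 * n_traits, 3) for name, share in reported_shares.items()
}
```

The three shares sum to 69.05%, which is within 0.01 of the reported 69.06% and consistent with rounding of the individual values. All three implied eigenvalues exceed 1, a commonly used retention threshold, although the abstract does not say which criterion the authors applied.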
