Effective dimensionality of environmental indicators: a principal component analysis with bootstrap confidence intervals

Abstract

Background: The health-risk assessment paradigm is shifting from single stressor evaluation towards cumulative assessments of multiple stressors. Recent efforts to develop broad-scale public health hazard datasets provide an opportunity to develop and evaluate multiple exposure hazards in combination.

Methods: We performed a multivariate study of the spatial relationship between 12 indicators of environmental hazard, 5 indicators of socioeconomic hardship, and 3 health outcomes. Indicators were obtained from CalEnviroScreen (version 3.0), a publicly available environmental justice screening tool developed by the State of California Environmental Protection Agency. The indicators were compared to the total rate of hospitalization for 14 ICD-9 disease categories (a measure of disease burden) at the zip code tabulation area population level. We performed principal component analysis to visualize and reduce the CalEnviroScreen data and spatial autoregression to evaluate associations with disease burden.

Results: CalEnviroScreen was strongly associated with the first principal component (PC) from a principal component analysis (PCA) of all 20 variables (Spearman ρ = 0.95). In a PCA of the 12 environmental variables, two PC axes explained 43% of variance, with the first axis indicating industrial activity and air pollution, and the second associated with ground-level ozone, drinking water contamination and PM2.5. Mass of pesticides used in agriculture was poorly or negatively correlated with all other environmental indicators, and with the CalEnviroScreen calculation method, suggesting a limited ability of the method to capture agricultural exposures. In a PCA of the 5 socioeconomic variables, the first PC explained 66% of variance, representing overall socioeconomic hardship. In simultaneous autoregressive models, the first environmental and socioeconomic PCs were both significantly associated with the disease burden measure, but more model variation was explained by the socioeconomic PCs.

Conclusions: This study supports the use of CalEnviroScreen for its intended purpose of screening California regions for areas with high environmental exposure and population vulnerability. Study results further suggest a hypothesis that, compared to environmental pollutant exposure, socioeconomic status has greater impact on overall burden of disease.
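The variance shares quoted in this abstract (for example, two axes explaining 43% of variance across 12 indicators) come from the eigenvalues of the indicator correlation matrix. The following is a minimal sketch of that calculation on synthetic data, not the authors' code or the CalEnviroScreen dataset. The function names and the simulated indicators are hypothetical, and the participation ratio used here as an "effective dimensionality" is one common definition that is assumed for illustration rather than taken from the abstract.

```python
import numpy as np


def pca_explained_variance(X):
    """Eigenvalues and explained-variance ratios of the correlation matrix of X."""
    X = np.asarray(X, dtype=float)
    # Standardize each indicator so that PCA operates on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = (Z.T @ Z) / (Z.shape[0] - 1)
    # eigvalsh returns ascending eigenvalues; reverse to get descending order.
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return eigvals, eigvals / eigvals.sum()


def participation_ratio(eigvals):
    """Assumed effective-dimensionality measure: (sum of eigenvalues)^2 / sum of squares."""
    return eigvals.sum() ** 2 / np.sum(eigvals ** 2)


# Synthetic stand-in for 12 indicators driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
n, p = 500, 12
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, p))
X = latent @ loadings + rng.normal(scale=1.0, size=(n, p))

eigvals, ratios = pca_explained_variance(X)
two_axis_share = ratios[:2].sum()      # share of variance on the first two axes
eff_dim = participation_ratio(eigvals)  # lies between 1 and p
print(f"first two axes: {two_axis_share:.2f}, effective dimensionality: {eff_dim:.2f}")
```

The participation ratio equals 1 when a single component carries all the variance and equals the number of indicators when variance is spread evenly, so it gives a single-number summary alongside the per-axis shares.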

Modified Principal Component Analysis for Identifying Key Environmental Indicators and Application to a Large-Scale Tidal Flat Reclamation

Water ◽ 10.3390/w10010069 ◽ 2018 ◽ Vol 10 (1) ◽ pp. 69 ◽ Cited By ~ 3

Author(s): Kejian Chu ◽ Wenjuan Liu ◽ Yuntong She ◽ Zulin Hua ◽ Min Tan ◽ ...

Keyword(s): Principal Component Analysis ◽ Tidal Flat ◽ Large Scale ◽ Principal Component ◽ Component Analysis ◽ Environmental Indicators

Effective dimensionality for principal component analysis of time series expression data

Biosystems ◽ 10.1016/s0303-2647(03)00128-x ◽ 2003 ◽ Vol 71 (3) ◽ pp. 311-317 ◽ Cited By ~ 9

Author(s): Michael Hörnquist ◽ John Hertz ◽ Mattias Wahde

Keyword(s): Principal Component Analysis ◽ Time Series ◽ Principal Component ◽ Component Analysis ◽ Expression Data ◽ Series Expression ◽ Effective Dimensionality ◽ Analysis Of Time Series

Geometrical Approximated Principal Component Analysis for Hyperspectral Image Analysis

Remote Sensing ◽ 10.3390/rs12111698 ◽ 2020 ◽ Vol 12 (11) ◽ pp. 1698 ◽ Cited By ~ 4

Author(s): Alina L. Machidon ◽ Fabio Del Frate ◽ Matteo Picchiani ◽ Octavian M. Machidon ◽ Petre L. Ogrutan

Keyword(s): Principal Component Analysis ◽ Dimensionality Reduction ◽ Hyperspectral Image ◽ Principal Component ◽ Component Analysis ◽ Hyperspectral Images ◽ Land Classification ◽ Hyperspectral Image Processing ◽ Hyperspectral Image Analysis ◽ Effective Dimensionality

Principal Component Analysis (PCA) is a method based on statistics and linear algebra techniques, used in hyperspectral satellite imagery to reduce data dimensionality and thereby speed up and improve the performance of subsequent hyperspectral image processing algorithms. This paper introduces gaPCA, a PCA approximation method based on a geometric construction approach: an alternative algorithm for computing the principal components from a geometrically constructed approximation of the standard PCA. The paper presents its application to remote sensing hyperspectral images. gaPCA has the potential to yield better land classification results than the standard PCA by preserving more of the information related to the smaller objects of the scene (or to the rare spectral objects), because it focuses on maximizing the range of the data rather than its variance. The paper validates gaPCA on four distinct datasets and performs comparative evaluations and metrics with the standard PCA method. A comparative land classification benchmark of gaPCA and the standard PCA using statistical-based tools is also described. The results show gaPCA is an effective dimensionality-reduction tool, with performance similar to, and in several cases even higher than, standard PCA on specific image classification tasks. gaPCA was shown to be more suitable for hyperspectral images with small structures or objects that need to be detected or where preponderantly spectral classes or spectrally similar classes are present.
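The contrast the abstract draws between maximizing range and maximizing variance can be illustrated on a toy dataset. The sketch below is not the published gaPCA algorithm, whose details are not given in the abstract. It assumes, purely for illustration, that a range-oriented first axis can be taken as the direction joining the two mutually farthest points. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def max_range_direction(X):
    """Unit vector joining the two farthest-apart points (assumed range-based axis)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances via the expansion |a-b|^2 = |a|^2 + |b|^2 - 2ab.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    v = X[j] - X[i]
    return v / np.linalg.norm(v)


def max_variance_direction(X):
    """First principal axis of standard PCA (direction of maximum variance)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


# A dense cluster elongated along x, plus two rare points far away along y.
rng = np.random.default_rng(0)
cluster = np.column_stack([rng.normal(scale=3.0, size=1000),
                           rng.normal(scale=0.3, size=1000)])
rare = np.array([[0.0, 40.0], [0.0, -40.0]])
X = np.vstack([cluster, rare])

range_dir = max_range_direction(X)   # dominated by the two rare points (y axis)
var_dir = max_variance_direction(X)  # dominated by the dense cluster (x axis)
print("range-based axis:", np.round(np.abs(range_dir), 3))
print("variance-based axis:", np.round(np.abs(var_dir), 3))
```

In this toy case the variance-based axis follows the bulk of the data, while the range-based axis is pulled toward the two rare points, which is the kind of sensitivity to small or rare objects the abstract attributes to gaPCA.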

A Lifecycle Sustainability Assessment of CO2 Emissions, Energy Consumption and Social Aspects of Methylic and Ethylic Biodiesel Using Principal Component Analysis

Materials Science Forum ◽ 10.4028/www.scientific.net/msf.965.1 ◽ 2019 ◽ Vol 965 ◽ pp. 1-12

Author(s): Stefano Ferrari Interlenghi ◽ José Luiz de Medeiros ◽ Ofélia de Queiroz Fernandes Araújo

Keyword(s): Principal Component Analysis ◽ Energy Consumption ◽ Energy Loss ◽ Principal Component ◽ Component Analysis ◽ Environmental Indicators ◽ Biodiesel Production ◽ Social Aspects ◽ Carbon Intensity ◽ Production Chain

The possibility of using renewable feedstocks for biodiesel production and reducing gas emissions makes it an attractive large-scale substitute for traditional fossil diesel. Although renewability is one of the main driving forces in biodiesel use, traditional production routes employ methanol as the transesterification agent, a chemical generated from fossil carbon. Aiming at further improving biodiesel’s sustainable performance, the replacement of methanol by ethanol has been proposed. Use of the ethylic production route could further reduce CO2 emissions and energy consumption and generate more jobs. The objective of this study is to unveil whether substituting ethanol for methanol does indeed result in a less carbon- and energy-intensive production chain while also increasing job generation and decreasing social strife. To assess production chain performance, a lifecycle approach was used, comprising: (i) data assemblage from literature to represent the ethylic/methylic biodiesel systems; (ii) construction of quantitative indicators to compare material and energetic flows; and (iii) Principal Component Analysis (PCA) for data interpretation and relevance ranking of calculated social/environmental indicators. Focus was given to CO2 emissions, energy consumption and social aspects of sustainability. Results show that use of ethanol does indeed reduce CO2 emissions, due to extra agricultural carbon sinks in the production chain, but increases energy consumption and energy loss. Methanol also resulted in a chain with higher average wages, more jobs generated and fewer forced labor cases, but with a higher accident rate and a high salary disparity. PCA showed that carbon intensity is one of the most important environmental metrics, while energy consumption was considered secondary, but the high correlation between these aspects strongly impacts chain sustainability. PCA also greatly differentiated agricultural and industrial links of respective production chains, with industrial links being governed by CO2 emissions and process safety and agricultural links by water consumption, land use and energy loss. A distinct tradeoff was seen between environmental and social considerations of sustainability and between carbon intensity and energy consumption reductions. As a result, substitution is only justified in scenarios in which CO2 emissions outweigh energy intensity and social aspects.

The Uncertainty and Robustness of the Principal Component Analysis as a Tool for the Dimensionality Reduction

Solid State Phenomena ◽ 10.4028/www.scientific.net/ssp.235.1 ◽ 2015 ◽ Vol 235 ◽ pp. 1-8

Author(s): Jacek Pietraszek ◽ Ewa Skrzypczak-Pietraszek

Keyword(s): Experimental Data ◽ Principal Component Analysis ◽ Dimensionality Reduction ◽ Confidence Intervals ◽ Dimensional Space ◽ Experimental Studies ◽ Principal Component ◽ Component Analysis ◽ High Dimensional ◽ Data Points

Experimental studies very often lead to datasets with a large number of noted attributes (observed properties) and a relatively small number of records (observed objects). Classic analysis cannot explain the recorded attributes in the form of regression relationships because there are too few data points. One method for filtering out unimportant attributes is the approach known as ‘dimensionality reduction’. A well-known example of this approach is principal component analysis (PCA), which transforms the data from the high-dimensional space to a space of fewer dimensions and gives heuristics for selecting the smallest necessary number of dimensions. The authors used this technique successfully in their previous investigations, but a question arose: is PCA robust and stable? This paper tries to answer this question by resampling experimental data and observing empirical confidence intervals of the parameters used to make decisions in PCA heuristics.
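The resampling idea described in this abstract can be sketched as a percentile bootstrap of the explained-variance ratios that PCA heuristics (such as a scree or cumulative-variance rule) rely on. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not specify the resampling scheme, so resampling rows with replacement and taking percentile intervals are assumptions, and the function names and synthetic data are hypothetical.

```python
import numpy as np


def explained_ratios(X):
    """Explained-variance ratios from the singular values of the standardized data."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2
    return var / var.sum()


def bootstrap_ratio_ci(X, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap intervals for each explained-variance ratio."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    samples = np.empty((n_boot, min(X.shape)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample records with replacement
        samples[b] = explained_ratios(X[idx])
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1 - alpha / 2, axis=0)
    return lower, upper


# Small synthetic dataset: 30 records, 8 attributes, one dominant latent factor.
rng = np.random.default_rng(1)
n, p = 30, 8
factor = rng.normal(size=(n, 1))
X = factor @ rng.normal(size=(1, p)) + 0.5 * rng.normal(size=(n, p))

point = explained_ratios(X)
lower, upper = bootstrap_ratio_ci(X, n_boot=300)
for k in range(3):
    print(f"PC{k + 1}: ratio={point[k]:.2f}, 95% interval=({lower[k]:.2f}, {upper[k]:.2f})")
```

Wide or overlapping intervals for adjacent components would suggest that a cutoff chosen from the point estimates alone is not stable for a dataset of this size.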
