A Proposition of Generalized Method for Forward Selection of Variables

Abstract. Diatoms play a key role in the development of quantitative methods for environmental reconstruction in lake ecosystems. Diatom-based calibration datasets developed during the last decades allow the inference of past limnological variables such as TP, pH or conductivity and provide information on the autecology and distribution of diatom taxa. However, little is known about the relationships between diatoms and climatic or geographic factors. The response of surface sediment diatom assemblages to abiotic factors is usually examined using canonical correspondence analysis (CCA) and subsequent forward selection of variables based on Monte Carlo permutation tests that show the set of predictors best explaining the distributions of diatom species. The results reported in 40 previous studies using this methodology in different regions of the world are re-analyzed in this paper. Bi- and multivariate statistics (canonical correlation and two-block partial least-squares) were used to explore the correspondence between physical, chemical and physiographical factors and the variables that explain most of the variance in the diatom datasets. Results show that diatom communities respond mainly to chemical variables (pH, nutrients) with lake depth being the most important physiographical factor. However, the relative importance of certain parameters varied along latitudinal and trophic gradients. Canonical analyses demonstrated a strong concordance with regard to the predictor variables and the amount of variance they captured, suggesting that, on a broad scale, lake diatoms give a robust indication of past and present environmental conditions.
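The forward selection with Monte Carlo permutation tests mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the procedure of any particular ordination package. It assumes a linear (RDA-like) model rather than CCA, permutes the rows of the residual response, and stops at the first non-significant candidate. The names `forward_select`, `explained`, `alpha` and `n_perm` are hypothetical.

```python
import random


def center(col):
    """Subtract the mean from a column of values."""
    m = sum(col) / len(col)
    return [v - m for v in col]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualize(col, basis):
    """Remove from col its projection onto basis (both already centered)."""
    bb = dot(basis, basis)
    if bb < 1e-12:
        return list(col)
    k = dot(col, basis) / bb
    return [c - k * b for c, b in zip(col, basis)]


def explained(x, Y):
    """Sum of squares of the response columns in Y captured by predictor x."""
    xx = dot(x, x)
    if xx < 1e-12:  # predictor is (numerically) redundant
        return 0.0
    return sum(dot(x, y) ** 2 / xx for y in Y)


def forward_select(X, Y, alpha=0.05, n_perm=199, seed=0):
    """X: dict name -> predictor column; Y: list of response columns.

    Returns [(name, p_value), ...] in order of selection.
    """
    rng = random.Random(seed)
    Xr = {name: center(col) for name, col in X.items()}
    Yr = [center(col) for col in Y]
    n = len(Yr[0])
    selected = []
    while Xr:
        # Candidate that explains most of the remaining response variation.
        best = max(Xr, key=lambda name: explained(Xr[name], Yr))
        observed = explained(Xr[best], Yr)
        # Permutation test: shuffle the rows of the residual response.
        hits = 0
        idx = list(range(n))
        for _ in range(n_perm):
            rng.shuffle(idx)
            Yp = [[y[i] for i in idx] for y in Yr]
            if explained(Xr[best], Yp) >= observed:
                hits += 1
        p = (hits + 1) / (n_perm + 1)
        if p > alpha:
            break
        selected.append((best, p))
        # Partial out the chosen predictor from everything that remains.
        b = Xr.pop(best)
        Yr = [residualize(y, b) for y in Yr]
        Xr = {name: residualize(col, b) for name, col in Xr.items()}
    return selected
```

A call such as `forward_select({"pH": ph, "depth": depth}, species_columns)` returns the retained predictors with their permutation p-values. Because each selected predictor is partialled out of the remaining ones, later candidates are credited only with variation not already explained.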

Environmental parameters affecting composition of modern Mediterranean planktonic foraminifera assemblages

10.5194/egusphere-egu2020-20896 ◽

2020 ◽

Author(s):

Lucía A. Azibeiro ◽

Michal Kucera ◽

Lukas Jonkers ◽

Francisco J. Sierro ◽

Angela Cloke-Hayes

Keyword(s):

Redundancy Analysis ◽

Planktonic Foraminifera ◽

Explanatory Power ◽

Chlorophyll Concentration ◽

Environmental Parameters ◽

Forward Selection ◽

Planktonic Foraminifer ◽

Water Column Stratification ◽

Selection Of Variables ◽

Selection Of

The reconstruction of sea surface temperature (SST) has long been at the center of paleoceanographic research. Studies in the Mediterranean have been no exception, since quantitative SST reconstruction in this semi-enclosed basin is crucial for understanding past climate change in the region. Many of these methods were based on planktonic foraminifera, both on their shell geochemistry and on the composition of their assemblages (e.g., transfer functions). Understanding and modeling the relationships between the modern census and environmental variables is the basis for transforming fossil data into quantitative estimates of these variables. Although globally foraminiferal assemblages appear to be determined mainly by temperature, in marginal basins such as the Mediterranean, ... In this study we attempt to determine which environmental parameters may control the variability of planktonic foraminifer assemblages in the modern Mediterranean. For this purpose, census counts of planktonic foraminifera assemblages from Mediterranean coretops (ForCenS database) have been integrated with monthly estimates of SST, chlorophyll concentration, and vertical gradients of various parameters as proxies for water column stratification/mixing (WOA 1998). Redundancy Analysis (RDA) was used to evaluate the explanatory power of, and the collinearity among, the tested environmental parameters, and a forward selection of variables was carried out to identify those independently explaining the largest share of the variance in the composition of planktonic foraminifera assemblages. Nine significant variables were identified. Three of them correspond to SST, while the other six are distributed between surface chlorophyll concentrations (2) and vertical thermal gradients (4). The most explanatory variables are June SST (R² = 0.43) and the December vertical thermal gradient (R² = 0.15).
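The R² values quoted for individual variables can be read as shares of total assemblage variance captured by a constrained ordination. As a sketch, assume a column-centered response matrix $Y$ (sites by taxa) and a column-centered predictor matrix $X$. These symbols are introduced here for illustration and do not appear in the abstract.

```latex
\hat{Y} = X \,(X^{\top} X)^{-1} X^{\top} Y,
\qquad
R^{2} = \frac{\operatorname{tr}\!\left(\hat{Y}^{\top}\hat{Y}\right)}
             {\operatorname{tr}\!\left(Y^{\top}Y\right)}

% Special case of a single centered predictor x and response columns y_j:
R^{2}_{x} = \frac{\sum_{j} \left(x^{\top} y_{j}\right)^{2}}
                 {\left(x^{\top} x\right)\,\sum_{j} y_{j}^{\top} y_{j}}
```

Under this reading, forward selection adds at each step the variable that raises $R^{2}$ the most once the variables already chosen have been accounted for.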

Faculty Opinions recommendation of Forward selection of explanatory variables.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1123331.580486 ◽

2008 ◽

Author(s):

Ary Hoffmann

Keyword(s):

Forward Selection ◽

Explanatory Variables ◽

Selection Of

Hierarchical selection of variables in sparse high-dimensional regression

Institute of Mathematical Statistics Collections - Borrowing Strength: Theory Powering Applications – A Festschrift for Lawrence D. Brown ◽

10.1214/10-imscoll605 ◽

2010 ◽

pp. 56-69 ◽

Cited By ~ 4

Author(s):

Peter J. Bickel ◽

Ya’acov Ritov ◽

Alexandre B. Tsybakov

Keyword(s):

High Dimensional ◽

Selection Of Variables ◽

High Dimensional Regression ◽

Hierarchical Selection ◽

Selection Of

State of the art in selection of variables and functional forms in multivariable analysis—outstanding issues

Diagnostic and Prognostic Research ◽

10.1186/s41512-020-00074-3 ◽

2020 ◽

Vol 4 (1) ◽

Cited By ~ 6

Author(s):

Willi Sauerbrei ◽

Aris Perperoglou ◽

Matthias Schmid ◽

Michal Abrahamowicz ◽

...

Keyword(s):

State Of The Art ◽

Multivariable Analysis ◽

Selection Of Variables ◽

Functional Forms ◽

Selection Of

117 On the selection of variables representing the physiological state of cell cultures

Control Engineering Practice ◽

10.1016/0967-0661(93)91512-u ◽

1993 ◽

Vol 1 (4) ◽

pp. 739

Keyword(s):

Cell Cultures ◽

Physiological State ◽

Selection Of Variables ◽

Selection Of

A bayesian predictive approach to the selection of variables in multiple regression

Communications in Statistics - Theory and Methods ◽

10.1080/03610928308828550 ◽

1983 ◽

Vol 12 (13) ◽

pp. 1553-1557 ◽

Cited By ~ 1

Author(s):

Ramona L. Trader

Keyword(s):

Multiple Regression ◽

Selection Of Variables ◽

Predictive Approach ◽

Selection Of

Investigating Investment, Inflation, Trade-Openness and Economic Progress Nexus for Pakistan

Global Management Sciences Review ◽

10.31703/gmsr.2020(v-iv).01 ◽

2020 ◽

Vol V (IV) ◽

pp. 1-9

Author(s):

Aftab Anwar ◽

Muhammad Masood Anwar ◽

Ghulam Yahya Khan

Keyword(s):

Economic Growth ◽

Economic Performance ◽

Trade Openness ◽

Short Term ◽

Economic Progress ◽

Critical Measure ◽

Selection Of Variables ◽

Guide Lines ◽

Selection Of

Inflation and the trade openness rate are considered critical measures of an economy's health. This article analyzes the relation of economic growth with investment, inflation and trade openness in Pakistan for 1970-2019. The policy guidelines from the analysis include the promotion of policies to increase investment and trade openness in the short and long term. The study used ARDL bounds testing for the long term and unrestricted error-correction techniques to discover short-term interrelations among a selection of variables. The results revealed that inflation is negatively related to economic performance and positively linked to investment and trade openness. The findings suggest that the government should focus more on investment-friendly policies in the country.

Problems in using p-curve analysis and text-mining to detect rate of p-hacking

10.7287/peerj.preprints.1266v3 ◽

2015 ◽

Author(s):

Dorothy V Bishop ◽

Paul A Thompson

Keyword(s):

Text Mining ◽

Curve Analysis ◽

P Values ◽

Evidential Value ◽

Selection Of Variables ◽

Dependent Variables ◽

Selection Of ◽

The Way

Background: The p-curve is a plot of the distribution of p-values below .05 reported in a set of scientific studies. Comparisons between ranges of p-values have been used to evaluate fields of research in terms of the extent to which studies have genuine evidential value, and the extent to which they suffer from bias in the selection of variables and analyses for publication, p-hacking. We argue that binomial tests on the p-curve are not robust enough to be used for this purpose. Methods: P-hacking can take various forms. Here we used R code to simulate the use of ghost variables, where an experimenter gathers data on several dependent variables but reports only those with statistically significant effects. We also examined a text-mined dataset used by Head et al. (2015) and assessed its suitability for investigating p-hacking. Results: We first show that a p-curve suggestive of p-hacking can be obtained if researchers misapply parametric tests to data that depart from normality, even when no p-hacking occurs. We go on to show that when there is ghost p-hacking, the shape of the p-curve depends on whether dependent variables are intercorrelated. For uncorrelated variables, simulated p-hacked data do not give the "p-hacking bump" just below .05 that is regarded as evidence of p-hacking, though there is a negative skew when simulated variables are inter-correlated. The way p-curves vary according to features of underlying data poses problems when automated text mining is used to detect p-values in heterogeneous sets of published papers. Conclusions: A significant bump in the p-curve just below .05 is not necessarily evidence of p-hacking, and lack of a bump is not indicative of lack of p-hacking. Furthermore, while studies with evidential value will usually generate a right-skewed p-curve, we cannot treat a right-skewed p-curve as an indicator of the extent of evidential value, unless we have a model specific to the type of p-values entered into the analysis. We conclude that it is not feasible to use the p-curve to estimate the extent of p-hacking and evidential value unless there is considerable control over the type of data entered into the analysis.
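The ghost-variable scenario described in the Methods can be illustrated with a small simulation. This is a hedged sketch, not the authors' R code. It assumes two groups with no true effect, unit-variance normal data, and a z-test with known variance in place of whatever test the paper used. The names `simulate_study`, `ghost_p_curve`, `n_dvs` and `r` are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_study(rng, n_per_group, n_dvs, r):
    """One null study: two groups measured on n_dvs dependent variables.

    Each variable has unit variance; a shared factor gives every pair of
    variables correlation r (0 <= r <= 1). Returns one p-value per variable.
    """
    def subject():
        common = rng.gauss(0.0, 1.0)
        return [math.sqrt(r) * common + math.sqrt(1.0 - r) * rng.gauss(0.0, 1.0)
                for _ in range(n_dvs)]

    g1 = [subject() for _ in range(n_per_group)]
    g2 = [subject() for _ in range(n_per_group)]
    se = math.sqrt(2.0 / n_per_group)  # standard error of the mean difference
    p_values = []
    for j in range(n_dvs):
        m1 = sum(s[j] for s in g1) / n_per_group
        m2 = sum(s[j] for s in g2) / n_per_group
        p_values.append(two_sided_p((m1 - m2) / se))
    return p_values


def ghost_p_curve(n_studies=2000, n_per_group=20, n_dvs=5, r=0.0, seed=1):
    """Counts of reported p-values in five bins of width .01 below .05.

    Ghost-variable p-hacking: only variables with p < .05 are reported.
    """
    rng = random.Random(seed)
    bins = [0] * 5
    for _ in range(n_studies):
        for p in simulate_study(rng, n_per_group, n_dvs, r):
            if p < 0.05:
                bins[min(int(p / 0.01), 4)] += 1
    return bins
```

Comparing `ghost_p_curve(r=0.0)` with, say, `ghost_p_curve(r=0.8)` gives a rough way to explore how intercorrelation among dependent variables affects the binned counts. The exact shapes depend on the assumptions stated above, so they should not be read as a reproduction of the paper's results.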

Selection of Variables in Regression Models Based on Inflated Distributions

Pakistan Journal of Statistics and Operation Research ◽

10.18187/pjsor.v7i2-sp.300 ◽

2011 ◽

Vol 7 (2-Sp) ◽

Author(s):

Aruna Rao ◽

Sumathi K

Keyword(s):

Regression Models ◽

Selection Of Variables ◽

Selection Of
