Prediction Limits of Allometric Equations: A Reanalysis of Ryder's Morphoedaphic Index

1989 ◽  
Vol 46 (3) ◽  
pp. 503-508 ◽  
Author(s):  
David C. Schneider ◽  
Richard L. Haedrich

Empirical equations are frequently used to predict potential fish harvest from lakes or reservoirs. The prediction limits around these estimates have not been evaluated. We found that prediction limits from Ryder's morphoedaphic equation were inaccurate because of artificial correlation between the dependent and independent variables used in the regression. Furthermore, there are inconsistencies between different versions of the model. Reanalysis of Ryder's data gave upper 95% prediction limits that were 4.8–5.9 times the lower 95% prediction limits. Separating the component variables of the morphoedaphic index reduced the prediction limits slightly: the new upper 95% limits were 4.5–5.6 times the lower limits. Theoretical scaling of fish harvest on lake volume, consistent with the known tendency of small lakes to yield more fish per unit area, reduced the upper prediction limits to 2.8 times the lower limits. The model, which was allometric in form, correctly predicted the exponent relating harvest to lake volume in two independent data sets.
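
A minimal sketch of the core calculation, using synthetic numbers rather than Ryder's data: fit a log-log (allometric) regression and report the ratio of the upper to the lower 95% prediction limit after back-transforming to the original harvest scale.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
log_volume = rng.uniform(0.5, 3.0, 30)                          # hypothetical log10 lake volume
log_harvest = 0.3 + 0.8 * log_volume + rng.normal(0, 0.15, 30)  # hypothetical log10 harvest

fit = sm.OLS(log_harvest, sm.add_constant(log_volume)).fit()

new_lakes = sm.add_constant(np.array([1.5, 2.0]))  # two lakes to predict
pred = fit.get_prediction(new_lakes).summary_frame(alpha=0.05)

lower = 10 ** pred["obs_ci_lower"]   # back-transform 95% prediction limits
upper = 10 ** pred["obs_ci_upper"]
print((upper / lower).round(2))      # ratio of upper to lower prediction limit
```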


1991 ◽  
Vol 48 (10) ◽  
pp. 1937-1943 ◽  
Author(s):  
Robert S. Rempel ◽  
Peter J. Colby

The morphoedaphic index (MEI) has been criticized because of the use of ratio variables in linear regression. Although the index is computationally simple, its continued use is questionable given the widespread access fisheries biologists now have to computerized statistical packages. We present a statistically valid analogue to the MEI, the morphoedaphic model (MEM), which uses multiple regression on the morphometric and fertility properties of lakes to predict annual fish yield. Surface area, lake volume, and total dissolved solids (TDS) are used to predict annual fish yield for a lake and to derive the associated confidence limits. Predicted yield from the newly derived model was compared with predictions from the original MEI. Comparisons were also made with models derived from Ontario sport and commercial fisheries data sets. The MEM derived from these partitioned data sets more accurately modelled the observed long-term yields for these lakes. Analysis of the remaining outliers suggests that several additional variables, and stratification, may be required to further improve the precision of the statistical model.
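
A minimal multiple-regression sketch in the spirit of the MEM, on synthetic data rather than the authors' Ontario data sets: surface area, volume, and TDS enter as separate log-transformed predictors instead of a single ratio, and 95% limits are reported on the yield scale.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 40
lakes = pd.DataFrame({
    "area":   rng.lognormal(3, 1, n),    # hypothetical surface area
    "volume": rng.lognormal(5, 1, n),    # hypothetical volume
    "tds":    rng.lognormal(4, 0.5, n),  # hypothetical total dissolved solids
})
lakes["fish_yield"] = np.exp(
    0.5 + 0.3 * np.log(lakes.tds) - 0.2 * np.log(lakes.volume / lakes.area)
    + rng.normal(0, 0.2, n)
)

mem = smf.ols("np.log(fish_yield) ~ np.log(area) + np.log(volume) + np.log(tds)",
              data=lakes).fit()
print(mem.summary())

# 95% limits for the first three lakes, back-transformed to the yield scale
limits = mem.get_prediction(lakes.iloc[:3]).summary_frame(alpha=0.05)
print(np.exp(limits[["obs_ci_lower", "obs_ci_upper"]]))
```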



2020 ◽  
Author(s):  
Garrett Stubbings ◽  
Spencer Farrell ◽  
Arnold Mitnitski ◽  
Kenneth Rockwood ◽  
Andrew Rutenberg

Abstract Frailty indices (FI) based on continuous-valued health data, such as those obtained from blood and urine tests, have been shown to be predictive of adverse health outcomes. However, creating an FI from such biomarker data requires a binarization treatment that is difficult to standardize across studies. In this work, we explore a "quantile" methodology for the generic treatment of biomarker data that allows us to construct an FI without pre-existing medical knowledge (i.e. risk thresholds) of the included biomarkers. We show that our quantile approach performs as well as, or even slightly better than, established methods for the National Health and Nutrition Examination Survey (NHANES) and the Canadian Study of Health and Aging (CSHA) data sets. Furthermore, we show that our approach is robust to cohort effects within studies as compared with other data-based methods. The success of our binarization approaches provides insight into the robustness of the FI as a health measure and the upper limits of the FI observed in various data sets, and highlights general difficulties in obtaining absolute scales for comparing FI between studies.
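
One plausible reading of a quantile-style binarization, sketched for illustration only (not the authors' exact procedure): each biomarker is scored as a deficit when it falls in an extreme quantile of the study population, and the FI is the mean deficit score per individual.

```python
import numpy as np
import pandas as pd

def quantile_fi(biomarkers: pd.DataFrame, q: float = 0.8) -> pd.Series:
    """FI from continuous biomarkers using population quantile cut-offs.

    Assumes, purely for illustration, that values above the q-th quantile of
    each biomarker mark a deficit; the direction of clinical risk is not used.
    """
    cuts = biomarkers.quantile(q)                 # per-biomarker threshold
    deficits = (biomarkers > cuts).astype(float)  # 1 = deficit, 0 = no deficit
    return deficits.mean(axis=1)                  # FI in [0, 1]

# hypothetical usage with random "lab values"
rng = np.random.default_rng(2)
labs = pd.DataFrame(rng.normal(size=(100, 20)),
                    columns=[f"biomarker_{i}" for i in range(20)])
print(quantile_fi(labs).describe())
```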



Hereditas ◽  
2019 ◽  
Vol 156 (1) ◽  
Author(s):  
T. H. Noel Ellis ◽  
Julie M. I. Hofer ◽  
Martin T. Swain ◽  
Peter J. van Dijk

Abstract A controversy arose over Mendel's pea crossing experiments after the statistician R.A. Fisher proposed how these may have been performed and criticised Mendel's interpretation of his data. Here we re-examine Mendel's experiments and investigate Fisher's statistical criticisms of bias. We describe pea varieties available in Mendel's time and show that these could readily provide all the material Mendel needed for his experiments; the characters he chose to follow were clearly described in catalogues of the time. The combination of character states available in these varieties, together with Eichling's report of crosses Mendel performed, suggests that two of his F3 progeny test experiments may have involved the same F2 population, and therefore that these data should not be treated as independent in statistical analyses of Mendel's data. A comprehensive re-examination of Mendel's segregation ratios does not support previous suggestions that they differ remarkably from expectation. The χ² values for his segregation ratios sum to a value close to expectation, and there is no deficiency of extreme segregation ratios. Overall, the χ values for Mendel's segregation ratios deviate slightly from the standard normal distribution; this is probably because of the variance associated with phenotypic rather than genotypic ratios and because Mendel excluded some data sets with small numbers of progeny, where he noted the ratios "deviate not insignificantly" from expectation.
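
A short illustration of the χ² bookkeeping described above, using three of Mendel's published F2 counts (seed shape 5474:1850, flower colour 705:224, pod colour 428:152); this is only a sketch of the per-experiment test and summation, not the authors' full analysis.

```python
from scipy import stats

# (dominant, recessive) F2 counts tested against the expected 3:1 ratio
experiments = [(5474, 1850), (705, 224), (428, 152)]

chi2_total, df_total = 0.0, 0
for dom, rec in experiments:
    n = dom + rec
    chi2, _ = stats.chisquare([dom, rec], f_exp=[0.75 * n, 0.25 * n])
    chi2_total += chi2
    df_total += 1

# total chi-squared, degrees of freedom, and the tail probability
print(chi2_total, df_total, 1 - stats.chi2.cdf(chi2_total, df_total))
```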



1987 ◽  
Vol 31 (7) ◽  
pp. 811-814
Author(s):  
Valerie J. Gawron ◽  
David J. Travale ◽  
Colin Drury ◽  
Sara Czaja

A major problem facing system designers today is predicting human performance in: 1) systems that have not yet been built, 2) situations that have not yet been experienced, and 3) situations for which there are only anecdotal reports. To address this problem, the Human Performance Expert System (Human) was designed. The system contains a large data base of equations derived from human performance research reported in the open literature. Human accesses these data to predict task performance times, task completion probabilities, and error rates. A problem was encountered when multiple independent data sets were relevant to one task. For example, a designer is interested in the effects of luminance and font size on the number of reading errors. Two data sets exist in the literature: one examining the effects of luminance, the other, font size. The data in the two sets were collected at different locations, with different subjects, and at different times. How can the two data sets be combined to address the designer's problem? Four combining algorithms were developed and then tested in two steps. In step one, two reaction-time experiments were conducted: one to evaluate the effect of the number of alternatives on reaction time; the second, the effects of signals per minute and the number of displays being monitored. The four algorithms were used on the data from these two experiments to predict reaction time in the situation where all three independent variables are manipulated simultaneously. In step two of the test procedure, a third experiment was conducted. Subjects who had not participated in either Experiment One or Two performed a reaction-time task under the combined effects of all three independent variables. The predictions made in step one were compared with the actual empirical data collected in step two. The results of these comparisons are presented.
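
The abstract does not specify the four combining algorithms, so purely as an illustration of the problem, here is one simple multiplicative rule: scale a common baseline reaction time by the relative effect each data set attributes to its own independent variable, assuming the effects act independently. All numbers are hypothetical.

```python
baseline_rt = 0.30  # hypothetical baseline reaction time (s)

# hypothetical per-variable effects, each expressed relative to that baseline
effect_alternatives = 0.42 / 0.30  # from a data set varying number of alternatives
effect_displays     = 0.36 / 0.30  # from a data set varying displays monitored

# combined prediction under the independence assumption
combined_rt = baseline_rt * effect_alternatives * effect_displays
print(f"predicted combined reaction time: {combined_rt:.3f} s")
```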



1989 ◽  
Vol 21 (2) ◽  
pp. 221-232 ◽  
Author(s):  
E Heikkila ◽  
P Gordon ◽  
J I Kim ◽  
R B Peiser ◽  
H W Richardson ◽  
...  

Hedonic regression methods are used to assess the impact of dwelling and structure characteristics, neighborhood effects, and multiple locations on a sample of almost 11 000 residential property sales in Los Angeles County in 1980. Correction for dwelling characteristics permits the analysis to be interpreted in terms of land values rather than property values per unit area. The selected equation explains more than 93% of the variation in the dependent variable (house price per unit of lot area). All the independent variables (five property or transaction characteristics, four neighborhood effects, and ten locational nodes) are statistically significant, with one major exception: distance from the CBD, which has a very low t-value and an unexpected sign. This result should be considered in the context of many superficial references, based largely on visual symbols such as new office buildings, to a revival of downtown Los Angeles. The authors interpret the finding that eight subcenters have a statistically significant influence on metropolitan residential land values in Los Angeles as yet another indication of the demise of the monocentric model and the need to discuss US metropolitan areas in polycentric terms.
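
A minimal hedonic-regression sketch on synthetic data (not the Los Angeles sample): the log of sale price per unit lot area is regressed on dwelling characteristics, a neighbourhood effect, and distances to the CBD and a subcentre, and the t-values indicate which locational terms matter.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 500
sales = pd.DataFrame({
    "sqft":        rng.normal(1600, 400, n),  # dwelling size
    "age":         rng.uniform(0, 60, n),     # structure age
    "school_q":    rng.uniform(0, 1, n),      # neighbourhood quality proxy
    "dist_cbd":    rng.uniform(1, 40, n),     # km to CBD
    "dist_subctr": rng.uniform(1, 25, n),     # km to nearest subcentre
})
sales["log_price_per_area"] = (
    5 + 0.0004 * sales.sqft - 0.004 * sales.age + 0.5 * sales.school_q
    - 0.015 * sales.dist_subctr + rng.normal(0, 0.2, n)
)

hedonic = smf.ols(
    "log_price_per_area ~ sqft + age + school_q + dist_cbd + dist_subctr",
    data=sales).fit()
print(hedonic.summary())  # coefficient signs and t-values by variable
```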



2021 ◽  
Author(s):  
Juan Guillermo López Guzmán ◽  
Cesar Julio Bustacara Medina

The popularity of Multiplayer Online Battle Arena (MOBA) video games has grown considerably, and this popularity, together with the complexity of their gameplay, has attracted the attention of researchers from various areas of knowledge in recent years, many of whom have turned to different machine learning techniques. The papers reviewed mainly look for patterns in multidimensional data sets, but these previous studies do not present a way to select the independent variables (predictors) used to train the models. For this reason, this paper proposes a list of variables based on the techniques used and the objectives of the research, providing a set of variables for finding patterns in MOBA video games. To obtain this list, the works consulted were grouped by the machine learning techniques used, ranging from rule-based systems to complex neural network architectures, and a further grouping was applied based on the objective of each study.



Galaxies ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Rainer Beck ◽  
Luke Chamandy ◽  
Ed Elson ◽  
Eric G. Blackman

Constraining dynamo theories of magnetic field origin by observation is indispensable but challenging, in part because the basic quantities measured by observers and predicted by modelers are different. We clarify these differences and sketch out ways to bridge the divide. Based on archival and previously unpublished data, we then compile various important properties of galactic magnetic fields for nearby spiral galaxies. We consistently compute strengths of total, ordered, and regular fields, pitch angles of ordered and regular fields, and we summarize the present knowledge on azimuthal modes, field parities, and the properties of non-axisymmetric spiral features called magnetic arms. We review related aspects of dynamo theory, with a focus on mean-field models and their predictions for large-scale magnetic fields in galactic discs and halos. Furthermore, we measure the velocity dispersion of H i gas in arm and inter-arm regions in three galaxies, M 51, M 74, and NGC 6946, since spiral modulation of the root-mean-square turbulent speed has been proposed as a driver of non-axisymmetry in large-scale dynamos. We find no evidence for such a modulation and place upper limits on its strength, helping to narrow down the list of mechanisms to explain magnetic arms. Successes and remaining challenges of dynamo models with respect to explaining observations are briefly summarized, and possible strategies are suggested. With new instruments like the Square Kilometre Array (SKA), large data sets of magnetic and non-magnetic properties from thousands of galaxies will become available, to be compared with theory.



2004 ◽  
Vol 4 (4) ◽  
pp. 4383-4406
Author(s):  
L. R. Lait ◽  
P. A. Newman ◽  
M. R. Schoeberl ◽  
T. McGee ◽  
L. Twigg ◽  
...  

Abstract. Ozone measurements from ozonesondes, AROTAL, DIAL, and POAM III instruments during the SOLVE-2/VINTERSOL period are composited in a time-varying, flow-following quasi-conservative (PV-θ) coordinate space; the resulting composites from each instrument are mapped onto the other instruments' locations and times. The mapped data are then used to intercompare data from the different instruments. Overall, the four ozone data sets are found to be in good agreement. AROTAL shows somewhat lower values below 16 km, and DIAL has a positive bias at the upper limits of its altitude range. These intercomparisons are consistent with those obtained from more conventional near-coincident profiles, where available. Although the PV-θ mapping technique entails larger uncertainties of individual profile differences compared to direct near-coincident comparisons, the ability to include much larger numbers of comparisons can make this technique advantageous.
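
A minimal sketch of the mapping idea only (synthetic values, not the SOLVE-2/VINTERSOL data): composite ozone in potential-vorticity / potential-temperature (PV-θ) bins for one instrument, then look the composite up at another instrument's PV-θ coordinates to produce "mapped" values for intercomparison.

```python
import numpy as np
from scipy.stats import binned_statistic_2d

rng = np.random.default_rng(4)
pv, theta = rng.uniform(2, 10, 2000), rng.uniform(350, 600, 2000)
ozone = 1.0 + 0.3 * pv + 0.004 * (theta - 350) + rng.normal(0, 0.2, 2000)  # made-up ppmv

# composite ozone for instrument A in PV-theta bins
comp, pv_edges, th_edges, _ = binned_statistic_2d(
    pv, theta, ozone, statistic="mean", bins=(8, 10))

# map the composite onto instrument B's sampling locations
pv_b, th_b = rng.uniform(2, 10, 50), rng.uniform(350, 600, 50)
i = np.clip(np.digitize(pv_b, pv_edges) - 1, 0, comp.shape[0] - 1)
j = np.clip(np.digitize(th_b, th_edges) - 1, 0, comp.shape[1] - 1)
mapped_ozone = comp[i, j]
print(mapped_ozone[:5])
```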



Author(s):  
J.R. Richardson ◽  
Robert Trivino

One hundred and sixty-one laboratory and one field clear-water abutment scour data sets were regressed using a Box-Tidwell power transformation procedure. The predictive equation was verified using additional laboratory data for compound channels. A momentum ratio term was used to account for flow redistribution, differences in overbank geometries, and scale sizes. The regression identified the most applicable and dominant independent variables that affect the magnitude of clear-water abutment scour. The resulting equation has a significantly lower standard error of estimate than previously published equations. The formulation is more robust and more accurately predicts abutment scour depth at both laboratory and prototype scales. With additional refinement, improved abutment scour predictive equations can be realized by using alternative regression procedures and including the momentum ratio to buffer the effect of abutment length at large prototype scales.
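
A minimal power-transformation fit in the spirit of Box-Tidwell, on synthetic data rather than the abutment-scour data sets: exponents for two predictors and a scale factor are estimated by nonlinear least squares (the variable names and values are assumptions for illustration).

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(5)
abutment_len = rng.uniform(0.1, 2.0, 160)    # hypothetical predictor
momentum_ratio = rng.uniform(0.2, 1.0, 160)  # hypothetical predictor
scour = (0.9 * abutment_len**0.4 * momentum_ratio**0.6
         * np.exp(rng.normal(0, 0.1, 160)))  # hypothetical scour depth

def power_model(X, a, b1, b2):
    """Scale factor times each predictor raised to its fitted exponent."""
    x1, x2 = X
    return a * x1**b1 * x2**b2

params, _ = curve_fit(power_model, (abutment_len, momentum_ratio), scour,
                      p0=[1.0, 0.5, 0.5])
print(params)  # fitted scale factor and exponents
```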



Traffic accidents on highways in Turkey cause both material and moral damage. To reduce this damage, prediction models were developed. In this study, demographic and traffic data from 1970 to 2007 are used. The data consist of dependent and independent variables: the dependent variable is the number of deaths (ND), and the independent variables are population (P), number of registered vehicles (VN), vehicle-kilometres (VK), and number of drivers (DN). Models were developed using artificial neural networks (ANN) and the logarithmic regression (LR) approach introduced by Smeed. Among the LR models, the PVNVKDN model, built by taking logarithms of the observed values, performed best. Among the ANN models built from the historical data sets, VKDN was the best, and VKDN was also the best of the models built from randomly selected data. When the performances of the best models are compared, VKDN performs best overall because it has the lowest error rate.
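
A minimal sketch of a Smeed-style logarithmic regression of the kind the LR models describe, using synthetic numbers rather than the Turkish 1970–2007 series: fatalities are regressed on vehicle-kilometres and driver numbers in log-log form (the VKDN combination named above), then predictions are back-transformed.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
years = 38
vk = np.linspace(10, 120, years) * (1 + rng.normal(0, 0.02, years))  # vehicle-km, made up
dn = np.linspace(2, 20, years) * (1 + rng.normal(0, 0.02, years))    # drivers, made up
nd = 900 * vk**0.4 * dn**0.3 * np.exp(rng.normal(0, 0.05, years))    # deaths, made up

X = sm.add_constant(np.column_stack([np.log(vk), np.log(dn)]))
lr = sm.OLS(np.log(nd), X).fit()

print(lr.params)                  # intercept and elasticities for VK and DN
print(np.exp(lr.predict(X))[:5])  # back-transformed fatality predictions
```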


