An analysis of error structure in modeling the stock–recruitment data of gadoid stocks using generalized linear models

2004 ◽  
Vol 61 (1) ◽  
pp. 134-146 ◽  
Author(s):  
Yan Jiao ◽  
David Schneider ◽  
Yong Chen ◽  
Joe Wroblewski

When modeling the stock–recruitment (S–R) relationship, the Cushing, Ricker, and other S–R models are fitted to the observed S–R data by estimating parameters with assumptions made concerning the model error structure. Using a generalized linear model approach, we explored and identified the appropriate model error structure in modeling S–R data for gadoid stocks. The S–R parameter estimation was found to be influenced by the choice of error distributions assumed in the analysis. In modeling S–R data for gadoid stocks, the Beverton–Holt model was found to be more sensitive to the assumption of model error distribution than the Cushing and Ricker models. The lognormal and gamma distributions had a higher probability of being acceptable model error distributions. Cluster analyses and summary statistics of error distributions in S–R modeling did not show consistent patterns in the identification of an acceptable model error structure among species, geographic distributions, and sample sizes. A better understanding of the factors and mechanisms resulting in differences in the choice of appropriate model error distributions for different populations is needed in future research. We recommend that the generalized linear model be used to identify acceptable model error structures in quantifying S–R relationships.
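
As an illustration of this approach (a sketch, not the authors' code): the Ricker curve R = a*S*exp(-b*S) becomes a GLM with a log link, log(mu) = log(a) + log(S) - b*S, so alternative error structures amount to swapping the GLM family. A minimal R sketch, assuming a hypothetical data frame sr with columns S (spawning stock) and R (recruitment):

    # Ricker model R = a*S*exp(-b*S)  =>  log(mu) = log(a) + log(S) - b*S
    # Lognormal errors: Gaussian GLM fitted to log-recruitment
    fit_lognormal <- glm(log(R) ~ S, offset = log(S), family = gaussian, data = sr)

    # Gamma errors: gamma GLM with a log link fitted to recruitment itself
    fit_gamma <- glm(R ~ S, offset = log(S), family = Gamma(link = "log"), data = sr)

    # In both fits the intercept estimates log(a) and the coefficient of S estimates -b;
    # residual diagnostics then guide the choice between the two error structures.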

2004 ◽  
Vol 61 (1) ◽  
pp. 122-133 ◽  
Author(s):  
Yan Jiao ◽  
Yong Chen ◽  
David Schneider ◽  
Joe Wroblewski

Stock–recruitment (S–R) models are commonly fitted to S–R data with a least-squares method. Errors in modeling are usually assumed to be normal or lognormal, regardless of whether such an assumption is realistic. A Monte Carlo simulation approach was used to evaluate the impact of the assumption of error structure on S–R modeling. The generalized linear model, which can readily deal with different error structures, was used in estimating parameters. This study suggests that the quality of S–R parameter estimation, measured by estimation errors, can be influenced by the realism of the error structure assumed in an estimation, the number of S–R data points, and the number of outliers in modeling. A small number of S–R data points and the presence of outliers in S–R data could increase the difficulty in identifying an appropriate error structure in modeling, which might lead to large biases in the S–R parameter estimation. This study shows that generalized linear model methods can help identify an appropriate error distribution in S–R modeling, leading to an improved estimation of parameters even when there are outliers and the number of S–R data points is small. We recommend that the generalized linear model be used for quantifying stock–recruitment relationships.
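
A minimal, self-contained sketch of this kind of Monte Carlo check (not the authors' simulation design): generate Ricker-type S–R data with gamma errors, fit under both a gamma and a lognormal error assumption, and compare the bias in the estimate of b.

    # Hypothetical simulation: true Ricker parameters a and b, gamma-distributed errors
    set.seed(1)
    a <- 4; b <- 0.002; n <- 25; shape <- 5
    one_run <- function() {
      S  <- runif(n, 100, 2000)
      mu <- a * S * exp(-b * S)
      R  <- rgamma(n, shape = shape, rate = shape / mu)   # E[R] = mu
      d  <- data.frame(S = S, R = R)
      c(gamma     = -coef(glm(R ~ S, offset = log(S), family = Gamma(link = "log"), data = d))["S"],
        lognormal = -coef(glm(log(R) ~ S, offset = log(S), family = gaussian, data = d))["S"])
    }
    est <- replicate(1000, one_run())
    rowMeans(est) - b   # bias in the estimate of b under each assumed error structure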


2016 ◽  
Vol 66 (3) ◽  
pp. 317-335
Author(s):  
Daniel Zaborski ◽  
Witold Stanisław Proskura ◽  
Katarzyna Wojdak-Maksymiec ◽  
Wilhelm Grzesiak

The aim of the present study was to: 1) check whether it would be possible to detect cows susceptible to mastitis at an early stage of their utilization based on selected genotypes and basic production traits in the first three lactations using ensemble data mining methods (boosted classification trees – BT and random forest – RF), 2) find out whether the inclusion of additional production variables for subsequent lactations would improve the detection performance of the models, 3) identify the most significant predictors of susceptibility to mastitis, and 4) compare the results obtained by using BT and RF with those for the more traditional generalized linear model (GLZ). A total of 801 records for Polish Holstein-Friesian Black-and-White cows were analyzed. The maximum sensitivity, specificity and accuracy on the test set were 72.13%, 39.73%, 55.90% (BT), 86.89%, 17.81%, 59.49% (RF) and 90.16%, 8.22%, 58.97% (GLZ), respectively. Inclusion of additional variables did not have a significant effect on model performance. The most significant predictors of susceptibility to mastitis were milk yield, days in milk, sire’s rank, and percentage of Holstein-Friesian genes, whereas calving season and genotypes (lactoferrin, tumor necrosis factor alpha, lysozyme and defensins) were ranked much lower. The applied models (both the data mining models and GLZ) showed low accuracy in detecting cows susceptible to mastitis, and therefore some other, more discriminating predictors should be used in future research.
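
A hedged R sketch of the three model types, assuming a hypothetical data frame cows with a 0/1 mastitis indicator mast, the production and genotype predictors, and training indices idx (none of these names come from the paper):

    library(randomForest)
    library(gbm)

    train <- cows[idx, ];  test <- cows[-idx, ]
    train_rf <- transform(train, mast = factor(mast))    # randomForest wants a factor response

    rf  <- randomForest(mast ~ ., data = train_rf, ntree = 500)
    bt  <- gbm(mast ~ ., data = train, distribution = "bernoulli",
               n.trees = 1000, interaction.depth = 3)    # gbm wants a 0/1 response
    glz <- glm(mast ~ ., data = train, family = binomial)

    p_rf  <- predict(rf,  test, type = "prob")[, 2]      # probability of mastitis
    p_bt  <- predict(bt,  test, n.trees = 1000, type = "response")
    p_glz <- predict(glz, test, type = "response")

    # Sensitivity, specificity and accuracy follow from thresholding these probabilities,
    # and importance(rf) / summary(bt) rank the predictors (e.g. milk yield, days in milk).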


Geophysics ◽  
1975 ◽  
Vol 40 (5) ◽  
pp. 763-772 ◽  
Author(s):  
William R. Green

Geophysical inversion methods are most effective when applied to linear functionals: it is therefore advantageous to employ linear models for geophysical data. A two‐dimensional linear model consisting of many horizontal prisms has been developed for interpretation of gravity profiles. A Backus‐Gilbert inversion which finds the acceptable model “nearest” to an initial estimate can be rapidly computed; iterative application of the technique allows a single‐density model to be developed at a modest expense in computer time. Gravity data from the Guichon Creek batholith were inverted as a test of the method, with results comparable to those from a standard polygon model.
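
A schematic of the "nearest acceptable model" step (a sketch under simplifying assumptions, not the paper's code): if the prism densities m enter the data linearly, d = G m, then the smallest correction to an initial estimate m0 that reproduces the observations is the minimum-norm solution of G * dm = d - G * m0.

    # Assumes a precomputed kernel matrix G (n_obs x n_prisms), observed gravity d,
    # and an initial density model m0; G %*% t(G) is taken to be invertible
    # (otherwise a damped or pseudo-inverse solve would be used).
    nearest_model_step <- function(G, d, m0) {
      r  <- d - G %*% m0                     # misfit of the current model
      dm <- t(G) %*% solve(G %*% t(G), r)    # minimum-norm correction
      m0 + dm
    }
    # Iterating this step, with constraints such as a single target density,
    # refines the model at modest cost:  m <- nearest_model_step(G, d, m)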


Author(s):  
Andrea Discacciati ◽  
Matteo Bottai

The instantaneous geometric rate represents the instantaneous probability of an event of interest per unit of time. In this article, we propose a method to model the effect of covariates on the instantaneous geometric rate with two models: the proportional instantaneous geometric rate model and the proportional instantaneous geometric odds model. We show that these models can be fit within the generalized linear model framework by using two nonstandard link functions that we implement in the user-defined link programs log_igr and logit_igr. We illustrate how to fit these models and how to interpret the results with an example from a randomized clinical trial on survival in patients with metastatic renal carcinoma.
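
The log_igr and logit_igr links themselves are Stata programs described in the article; as a generic illustration of how a nonstandard link can be supplied to a GLM, here is the R pattern for a user-defined "link-glm" object, shown with a log-log link purely as a stand-in (the data frame and variable names below are hypothetical):

    loglog <- structure(
      list(
        linkfun  = function(mu)  -log(-log(mu)),          # eta as a function of mu
        linkinv  = function(eta) exp(-exp(-eta)),         # mu as a function of eta
        mu.eta   = function(eta) exp(-exp(-eta) - eta),   # d mu / d eta
        valideta = function(eta) TRUE,
        name     = "loglog"
      ),
      class = "link-glm"
    )
    fit <- glm(event ~ treatment, family = binomial(link = loglog), data = trial)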


Author(s):  
Shaolin Hu ◽  
Karl Meinke ◽  
Rushan Chen ◽  
Ouyang Huajiang

Iterative Estimators of Parameters in Linear Models with Partially Variant Coefficients

A new kind of linear model with partially variant coefficients is proposed, and a series of iterative estimation algorithms is introduced and verified. The new generalized linear model includes the ordinary linear regression model as a special case. The iterative algorithms efficiently overcome computational difficulties arising from multidimensional inputs and continually appended parameters. An important application described at the end of the article shows that the new model is reasonable and applicable in practice.
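
The article's own algorithms are not reproduced in the abstract; as a generic stand-in for this style of estimation, a recursive least-squares update in R folds each new observation into the running estimate without refitting from scratch:

    # theta: current coefficient estimate; P: current "covariance" matrix;
    # x: new regressor vector; y: new response value
    rls_update <- function(theta, P, x, y) {
      x <- matrix(x, ncol = 1)
      k <- (P %*% x) / drop(1 + t(x) %*% P %*% x)       # gain vector
      theta_new <- theta + k %*% (y - drop(t(x) %*% theta))
      P_new     <- P - k %*% t(x) %*% P
      list(theta = drop(theta_new), P = P_new)
    }
    # Start from theta = rep(0, p) and P = 1e6 * diag(p),
    # then stream observations through rls_update().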


Author(s):  
Michael Fosu Ofori ◽  
Stephen B. Twum ◽  
Jackson A. Y. Osborne

Background: Generalized linear models are mostly fitted to data that are not correlated. However, data collected from health and epidemiological studies are very often correlated, either as a result of the sampling methods or the randomness associated with the collection of such data. Fitting generalized linear models, which produce only fixed effects, to such data could therefore lead to overdispersion in the model estimates. Objectives: The objective of this study is to fit both generalized linear and generalized linear mixed models to correlated data and compare the results of the two models. Methods: Logistic regression is employed in fitting the generalized linear model, since the dependent variable in the study is binary, whilst the GLIMMIX procedure in SAS is used to fit the generalized linear mixed model. Results: The generalized linear model produces overdispersion, with larger errors among the parameter estimates than the generalized linear mixed model. Conclusion: In dealing with correlated data, the generalized linear mixed model, which can handle both fixed and random effects, is preferable to the generalized linear model.
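
A hedged R sketch of the comparison (variable names are illustrative, not the study's), with lme4::glmer standing in for SAS's GLIMMIX procedure:

    library(lme4)

    glm_fit  <- glm(y ~ age + treatment, family = binomial, data = d)
    glmm_fit <- glmer(y ~ age + treatment + (1 | cluster), family = binomial, data = d)

    # Quick overdispersion check for the GLM: Pearson chi-square / residual df
    sum(residuals(glm_fit, type = "pearson")^2) / df.residual(glm_fit)

    summary(glmm_fit)   # fixed effects plus the between-cluster variance component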


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Salman Khan ◽  
Hayden T. Schilling ◽  
Mohammad Afzal Khan ◽  
Devendra Kumar Patel ◽  
Ben Maslen ◽  
...  

Otoliths are commonly used to discriminate between fish stocks, through both elemental composition and otolith shape. Typical studies also have a large number of elemental compositions and shape measures relative to the number of otolith samples, with these measures exhibiting strong mean–variance relationships. These properties make otolith composition and shape data highly suitable for use within a multivariate generalised linear model (MGLM) framework, yet MGLMs have never been applied to otolith data. Here we apply both a traditional distance-based permutational multivariate analysis of variance (PERMANOVA) and MGLMs to a case study of striped snakehead (Channa striata) in India. We also introduce the Tweedie and gamma distributions as suitable error structures for the MGLMs, drawing similarities to the properties of biomass data. We demonstrate that otolith elemental data and combined otolith elemental and shape data violate the PERMANOVA assumption of homogeneity of variance and may give misleading results, while the assumptions of the MGLM with Tweedie and gamma distributions are shown to be satisfied and are appropriate for both otolith shape and elemental composition data. Consistent differences between three groups of C. striata were identified using otolith shape, otolith chemistry, and a combined otolith shape and chemistry dataset. This suggests that future research should investigate whether there are demographic differences between these groups which may influence management considerations. The MGLM method is widely applicable and could be applied to any multivariate otolith shape or elemental composition dataset.
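
A hedged sketch of the MGLM idea with gamma errors, written in base R (packages such as mvabund automate this, including resampling-based inference; the object names below are made up): fit one gamma GLM per otolith variable against group and combine the per-response likelihood-ratio statistics.

    Y     <- as.matrix(otolith[, measure_cols])   # one column per element or shape descriptor
    group <- otolith$group

    lr_stats <- apply(Y, 2, function(y) {
      full <- glm(y ~ group, family = Gamma(link = "log"))
      null <- glm(y ~ 1,     family = Gamma(link = "log"))
      (null$deviance - full$deviance) / summary(full)$dispersion
    })
    sum(lr_stats)   # multivariate test statistic; permuting group labels gives its p-value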


2021 ◽  
pp. 181-208
Author(s):  
Justin C. Touchon

Chapter 7 introduces one of the most useful statistical frameworks for the modern life scientist: the generalized linear model (GLM). GLMs extend the linear model to an array of non-normally distributed data, such as Poisson, negative binomial, binomial, and gamma distributed data. These models dramatically improve the breadth of data that can be properly analysed without resorting to non-parametric statistics. Using the same RxP dataset, readers learn how to assess the error distribution of their data and evaluate competing models to achieve the best, most robust analysis possible. Diagnostic plots and assessment of model fit are taught throughout, as is how to interpret model output and calculate summary statistics. Plotting non-normal error distributions with ggplot2 is taught, as is using the predict() function.
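
The flavour of workflow the chapter teaches, sketched here with made-up column names rather than the book's RxP variables:

    # Poisson GLM for a count response, with model comparison and prediction
    fit_pois <- glm(count ~ treatment * density, family = poisson, data = rxp)

    summary(fit_pois)                    # coefficients on the log scale
    anova(fit_pois, test = "Chisq")      # assess the contribution of each term

    # predict() on the response scale, ready for plotting fitted curves with ggplot2
    newd <- expand.grid(treatment = levels(rxp$treatment), density = 1:50)
    newd$fit <- predict(fit_pois, newdata = newd, type = "response")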


2012 ◽  
Vol 8 (16) ◽  
pp. 221-237 ◽  
Author(s):  
Jackelyne Gómez–Restrepo ◽  
Myladis Cogollo–Flórez

The detection of bank fraud is a topic in which many financial sector companies have invested time and resources. However, finding patterns in the methodologies used to commit fraud in banks is a job that primarily requires intimate knowledge of customer behavior, with the idea of isolating those transactions which do not correspond to what the client usually does. Thus, the solutions proposed in the literature tend to focus on identifying outliers or groups, but fail to analyse each client or forecast fraud. This paper evaluates the implementation of a generalized linear model to detect fraud. With this model, unlike conventional methods, we consider the heterogeneity of customers. We generate not only a global model but also a model for each customer, which describes the behavior of each one according to their transactional history and previously detected fraudulent transactions. In particular, a mixed logistic model is used to estimate the probability that a transaction is fraudulent, using information collected by the banking systems at different moments in time.
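
A hedged sketch of such a mixed (random-intercept) logistic model in R with lme4 (variable names are illustrative, not the paper's): the fixed effects give the global model, while the customer-specific intercept lets each client's transactional history shift the baseline fraud probability.

    library(lme4)

    fit <- glmer(is_fraud ~ amount + hour_of_day + merchant_type + (1 | customer_id),
                 family = binomial, data = transactions)

    # Estimated probability that each transaction is fraudulent
    transactions$p_fraud <- predict(fit, type = "response")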

