Composition, biometry and statistical relationships between the cnidom and body size in the sea anemone Oulactis muscosa (Cnidaria: Actiniaria)

Author(s):  
F.H. Acuña ◽  
A.C. Excoffon ◽  
L. Ricci

This study analyses the possible relationships between body size and length of cnidae from different tissues of the sea anemone Oulactis muscosa. We describe the cnidom, providing new qualitative and quantitative data. Our description adds spirocysts for tentacles and acrorhagi, and is more precise about the ranges and types of basitrichs, microbasic b-mastigophores, and holotrichs. We distinguish two types of holotrichs in the acrorhagi, and differentiate between microbasic b-mastigophores and basitrichs in the actinopharynx and mesenterial filaments. A relationship between cnida length and body weight was not demonstrated. The results are based on a complete account of cnida types from all tissues. Considering the large number of capsules measured (5400) and the modern statistical tools employed, we think that a normal distribution of cnida lengths is uncommon, and the assumption of normality is perhaps refuted. This finding is very important when a quantitative analysis of cnidae is necessary and an adequate statistical tool must be chosen. We have shown that generalized linear models are an alternative, so analyses can be done with parametric methods despite the non-normal distribution of cnida size. The use of these statistical tools should become more widespread, since appropriate packages for such analyses (for example, in R) are available on the web and the results obtained are robust and powerful.
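As an illustration of why the normality of length measurements is worth checking before choosing a method, here is a minimal Python sketch of the Jarque-Bera normality test. It is not the generalized linear model analysis the abstract describes, and the capsule lengths are simulated, hypothetical values, not data from the study.

```python
import math
import random


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for the Jarque-Bera test."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


rng = random.Random(0)
# Hypothetical right-skewed capsule lengths (log-normal), for illustration only.
skewed_lengths = [rng.lognormvariate(3.0, 0.4) for _ in range(500)]
# Hypothetical roughly symmetric lengths, for comparison.
symmetric_lengths = [rng.gauss(20.0, 2.0) for _ in range(500)]

jb_skewed, p_skewed = jarque_bera(skewed_lengths)
jb_symmetric, p_symmetric = jarque_bera(symmetric_lengths)
print(f"skewed:    JB={jb_skewed:.1f}, p={p_skewed:.3g}")
print(f"symmetric: JB={jb_symmetric:.1f}, p={p_symmetric:.3g}")
```

A small p-value for the skewed sample suggests that methods assuming normally distributed lengths would be a poor fit, which is the situation in which a generalized linear model with a non-normal error family is worth considering.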

Author(s):  
Felipe De Mendiburu ◽  
Reinhard Simon

Plant breeders and educators working with the International Potato Center (CIP) needed freely available statistical tools. In response, we first created a set of scripts for specific tasks using the open source statistical software R. Based on these, we eventually compiled the R package agricolae, as it covered a niche. Here we describe its main functions for the first time in the form of an article. We also review its reception using download statistics, citation data, and feedback from a user survey. We highlight usage in our extended network of collaborators. The package has found applications beyond agriculture in fields like aquaculture, ecology, biodiversity, conservation biology and cancer research. In summary, the package agricolae is a well-established statistical toolbox based on R, with a broad range of applications in the design and analysis of experiments, including in the wider biological community.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nasim Bararpour ◽  
Federica Gilardi ◽  
Cristian Carmeli ◽  
Jonathan Sidibe ◽  
Julijana Ivanisevic ◽  
...  

As a powerful phenotyping technology, metabolomics provides new opportunities in biomarker discovery through metabolome-wide association studies (MWAS) and the identification of metabolites having a regulatory effect in various biological processes. While mass spectrometry-based (MS) metabolomics assays offer high throughput and sensitivity, MWAS require long-term data acquisition, which generates an analytical signal drift over time that can hinder the discovery of real, biologically relevant changes. We developed “dbnorm”, a package in the R environment, which allows for an easy comparison of the model performance of advanced statistical tools commonly used in metabolomics to remove batch effects from large metabolomics datasets. “dbnorm” integrates advanced statistical tools to inspect the dataset structure not only at the macroscopic (sample batches) scale, but also at the microscopic (metabolic features) level. To compare model performance on data correction, “dbnorm” assigns a score that helps users identify the best-fitting model for each dataset. In this study, we applied “dbnorm” to two large-scale metabolomics datasets as a proof of concept. We demonstrate that “dbnorm” allows for the accurate selection of the most appropriate statistical tool to efficiently remove the signal drift over time and to focus on the relevant biological components of complex datasets.


2004 ◽  
Vol 313 (1) ◽  
pp. 63-73 ◽  
Author(s):  
O. Chomsky ◽  
Y. Kamenir ◽  
M. Hyams ◽  
Z. Dubinsky ◽  
N.E. Chadwick-Furman

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e10849
Author(s):  
Maximilian Knoll ◽  
Jennifer Furkel ◽  
Juergen Debus ◽  
Amir Abdollahi

Background: Model building is a crucial part of omics-based biomedical research to transfer classifications and obtain insights into underlying mechanisms. Feature selection is often based on minimizing the error between model predictions and a given classification (maximizing accuracy). Human ratings/classifications, however, might be error-prone, with discordance rates between experts of 5–15%. We therefore evaluate whether a feature pre-filtering step might improve identification of features associated with true underlying groups. Methods: Data were simulated for up to 100 samples and up to 10,000 features, 10% of which were associated with the ground truth comprising 2–10 normally distributed populations. Binary and semi-quantitative ratings with varying error probabilities were used as classification. For feature preselection, standard cross-validation (V2) was compared to a novel heuristic (V1) applying univariate testing, multiplicity adjustment and cross-validation on switched dependent (classification) and independent (features) variables. Preselected features were used to train logistic regression/linear models (backward selection, AIC). Predictions were compared against the ground truth (ROC, multiclass-ROC). As a use case, multiple feature selection/classification methods were benchmarked against the novel heuristic to identify prognostically different G-CIMP negative glioblastoma tumors from the TCGA-GBM 450k methylation array data cohort, starting from a fuzzy UMAP-based rough and erroneous separation. Results: V1 yielded higher median AUC ranks for two true groups (ground truth), with smaller differences for true graduated differences (3–10 groups). Lower fractions of models were successfully fit with V1. Median AUCs for binary classification and two true groups were 0.91 (range: 0.54–1.00) for V1 (Benjamini-Hochberg) and 0.70 (0.28–1.00) for V2; 13% (n = 616) of V2 models showed AUCs ≤ 50% for 25 samples and 100 features. For larger numbers of features and samples, median AUCs were 0.75 (range 0.59–1.00) for V1 and 0.54 (range 0.32–0.75) for V2. In the TCGA-GBM data, modelBuildR allowed the best prognostic separation of patients, with the highest median overall survival difference (7.51 months), followed by a difference of 6.04 months for a random forest based method. Conclusions: The proposed heuristic is beneficial for the retrieval of features associated with two true groups classified with errors. We provide the R package modelBuildR to simplify (comparative) evaluation/application of the proposed heuristic (http://github.com/mknoll/modelBuildR).
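The multiplicity adjustment step named in the Methods can be illustrated with the Benjamini-Hochberg procedure. The sketch below is a plain Python version of that adjustment only. It does not reproduce the full V1 heuristic or the modelBuildR API, and the p-values and the 0.05 cutoff are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-feature p-values from univariate tests.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
adjusted_p = benjamini_hochberg(raw_p)

# Keep features whose adjusted p-value is at most 0.05 (assumed cutoff).
selected = [i for i, q in enumerate(adjusted_p) if q <= 0.05]
print(selected)
```

In a pre-filtering workflow of this kind, only the selected features would be passed on to the model-building step.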


Author(s):  
Zachary D. Kurtz ◽  
Richard Bonneau ◽  
Christian L. Müller

Detecting community-wide statistical relationships from targeted amplicon-based and metagenomic profiling of microbes in their natural environment is an important step toward understanding the organization and function of these communities. We present a robust and computationally tractable latent graphical model inference scheme that allows simultaneous identification of parsimonious statistical relationships among microbial species and unobserved factors that influence the prevalence and variability of the abundance measurements. Our method comes with theoretical performance guarantees and is available within the SParse InversE Covariance estimation for Ecological ASsociation Inference (SPIEC-EASI) framework (‘SpiecEasi’ R-package). Using simulations, as well as a comprehensive collection of amplicon-based gut microbiome datasets, we illustrate the method’s ability to jointly identify compositional biases, latent factors that correlate with observed technical covariates, and robust statistical microbial associations that replicate across different gut microbial data sets.


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Lorentz Jäntschi ◽  
Donatella Bálint ◽  
Sorana D. Bolboacă

Multiple linear regression analysis is widely used to link an outcome with predictors for a better understanding of the behaviour of the outcome of interest. Usually, under the assumption that the errors follow a normal distribution, the coefficients of the model are estimated by minimizing the sum of squared deviations. A new approach based on maximum likelihood estimation is proposed for finding the coefficients of linear models with two predictors without any restrictive assumptions on the distribution of the errors. The algorithm was developed, implemented, and tested as a proof of concept using fourteen sets of compounds by investigating the link between activity/property (as outcome) and structural feature information incorporated by molecular descriptors (as predictors). The results on real data demonstrated that in all investigated cases the power of the error is significantly different from the conventional value of two when the Gauss-Laplace distribution was used to relax the restrictive assumption of the normal distribution of the error. Therefore, the Gauss-Laplace distribution of the error could not be rejected, while the hypothesis that the power of the error from the Gauss-Laplace distribution is normally distributed also failed to be rejected.
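For reference, the conventional baseline described above (minimizing the sum of squared deviations for a model with two predictors) can be sketched in plain Python via the normal equations. This is only the least-squares baseline, not the maximum likelihood approach with a Gauss-Laplace error that the paper proposes. The function names and the data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_two_predictor_ols(x1, x2, y):
    """Return [intercept, slope for x1, slope for x2] minimizing squared error."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical descriptors and a noise-free outcome y = 1 + 2*x1 - 0.5*x2.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [1.0 + 2.0 * u - 0.5 * v for u, v in zip(x1, x2)]

beta = fit_two_predictor_ols(x1, x2, y)
print([round(b, 6) for b in beta])
```

Replacing the squared residual with a different power of the absolute residual would change the objective and generally requires an iterative optimizer rather than a closed-form solve.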


2002 ◽  
Vol 1 (2) ◽  
pp. 48 ◽  
Author(s):  
Paulo Jorge Sanches Barbeira

Histograms and normal distribution curves (Gaussian fit) were used to determine the average composition of the gasoline commercialized in the state of Minas Gerais, Brazil, and to detect samples with atypical composition. Atypical composition may arise from the fact that the gasoline comes from distinct sources (refinery or petrochemical plant) or may be due to the careful addition of industrial solvents. A considerable number of samples of atypical composition have been detected despite the samples tested having the same source. This statistical analytical tool has proven useful in the evaluation of fuel quality.
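One simple way to flag atypical samples against a fitted normal curve is to estimate the mean and standard deviation and mark values that lie far from the mean. The Python sketch below assumes a threshold of three standard deviations and uses made-up composition values. The abstract does not state the criterion the authors applied.

```python
import statistics


def flag_atypical(values, z_threshold=3.0):
    """Return indices of values more than z_threshold SDs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > z_threshold]


# Hypothetical measurements of one composition parameter; the last is atypical.
measurements = [
    24.8, 25.1, 25.0, 24.9, 25.2, 25.0, 24.7, 25.3, 25.1, 24.9,
    25.0, 25.2, 24.8, 25.1, 24.9, 25.0, 25.1, 24.9, 25.0, 31.0,
]

atypical = flag_atypical(measurements)
print(atypical)
```

Because an extreme value inflates the estimated standard deviation, this check becomes less sensitive for small samples. A robust estimate of spread is one alternative.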


2015 ◽  
Vol 4 (1) ◽  
Author(s):  
Johan Zetterqvist ◽  
Arvid Sjölander

A common goal of epidemiologic research is to study the association between a certain exposure and a certain outcome, while controlling for important covariates. This is often done by fitting a restricted mean model for the outcome, as in generalized linear models (GLMs) and in generalized estimating equations (GEEs). If the covariates are high-dimensional, then it may be difficult to specify the model correctly. This is an important concern, since model misspecification may lead to biased estimates. Doubly robust estimation is an estimation technique that offers some protection against model misspecification. It utilizes two models, one for the outcome and one for the exposure, and produces unbiased estimates of the exposure-outcome association if either model is correct, not necessarily both. Despite its obvious appeal, doubly robust estimation is not used on a regular basis in applied epidemiologic research. One reason for this could be the lack of up-to-date software. In this paper we describe a new

