Comparison study in statistical estimation of gene effects based on a real data set

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties for the proposed model were investigated which include raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni and Zenga curves, probability weighted moment, stress strength model, order statistics, and record statistics. The model parameters were estimated by using the maximum likelihood criterion and the behaviours of these estimates were examined by conducting a simulation study. The applicability of the new model is illustrated by applying it on a real data set.

Download Full-text

Evaluation for estimating of the PDF and the CDF of Generalized Inverted Exponential Distribution with Application in Industry

Advances in Mathematics: Scientific Journal ◽

10.37418/amsj.9.1.39 ◽

2020 ◽

pp. 507-522

Author(s):

Parisa Torkaman

Keyword(s):

Least Squares ◽

Exponential Distribution ◽

Mean Squared Error ◽

Weighted Least Squares ◽

Real Data ◽

Minimum Variance ◽

Cumulative Distribution ◽

Estimation Methods ◽

Data Set ◽

Better Than

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. This paper, the estimation of the probability density function and the cumulative distribution function of with five different estimation methods: uniformly minimum variance unbiased(UMVU), maximum likelihood(ML), least squares(LS), weighted least squares (WLS) and percentile(PC) estimators are considered. The performance of these estimation procedures, based on the mean squared error (MSE) by numerical simulations are compared. Simulation studies express that the UMVU estimator performs better than others and when the sample size is large enough the ML and UMVU estimators are almost equivalent and efficient than LS, WLS and PC. Finally, the result using a real data set are analyzed.

Download Full-text

HCVS: Pinpointing Chromatin States Through Hierarchical Clustering and Visualization Scheme

Current Bioinformatics ◽

10.2174/1574893613666180402141107 ◽

2019 ◽

Vol 14 (2) ◽

pp. 148-156

Author(s):

Nighat Noureen ◽

Sahar Fazal ◽

Muhammad Abdul Qadir ◽

Muhammad Tanvir Afzal

Keyword(s):

Hierarchical Clustering ◽

Real Data ◽

Cell Types ◽

Computational Scheme ◽

Data Set ◽

Chromatin States ◽

Functional Regions ◽

Visualization Strategy ◽

Hidden States ◽

Next Generation Sequencing Ngs

Background: Specific combinations of Histone Modifications (HMs) contributing towards histone code hypothesis lead to various biological functions. HMs combinations have been utilized by various studies to divide the genome into different regions. These study regions have been classified as chromatin states. Mostly Hidden Markov Model (HMM) based techniques have been utilized for this purpose. In case of chromatin studies, data from Next Generation Sequencing (NGS) platforms is being used. Chromatin states based on histone modification combinatorics are annotated by mapping them to functional regions of the genome. The number of states being predicted so far by the HMM tools have been justified biologically till now. Objective: The present study aimed at providing a computational scheme to identify the underlying hidden states in the data under consideration. </P><P> Methods: We proposed a computational scheme HCVS based on hierarchical clustering and visualization strategy in order to achieve the objective of study. Results: We tested our proposed scheme on a real data set of nine cell types comprising of nine chromatin marks. The approach successfully identified the state numbers for various possibilities. The results have been compared with one of the existing models as well which showed quite good correlation. Conclusion: The HCVS model not only helps in deciding the optimal state numbers for a particular data but it also justifies the results biologically thereby correlating the computational and biological aspects.

Download Full-text

Implementation of a Modified Faster R-CNN for Target Detection Technology of Coastal Defense Radar

Remote Sensing ◽

10.3390/rs13091703 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1703

Author(s):

He Yan ◽

Chao Chen ◽

Guodong Jin ◽

Jindong Zhang ◽

Xudong Wang ◽

...

Keyword(s):

False Alarm ◽

False Alarm Rate ◽

Target Detection ◽

Real Data ◽

Detection Performance ◽

Detection Accuracy ◽

Constant False Alarm Rate ◽

Data Set ◽

Detection Technology ◽

Coastal Defense

The traditional method of constant false-alarm rate detection is based on the assumption of an echo statistical model. The target recognition accuracy rate and the high false-alarm rate under the background of sea clutter and other interferences are very low. Therefore, computer vision technology is widely discussed to improve the detection performance. However, the majority of studies have focused on the synthetic aperture radar because of its high resolution. For the defense radar, the detection performance is not satisfactory because of its low resolution. To this end, we herein propose a novel target detection method for the coastal defense radar based on faster region-based convolutional neural network (Faster R-CNN). The main processing steps are as follows: (1) the Faster R-CNN is selected as the sea-surface target detector because of its high target detection accuracy; (2) a modified Faster R-CNN based on the characteristics of sparsity and small target size in the data set is employed; and (3) soft non-maximum suppression is exploited to eliminate the possible overlapped detection boxes. Furthermore, detailed comparative experiments based on a real data set of coastal defense radar are performed. The mean average precision of the proposed method is improved by 10.86% compared with that of the original Faster R-CNN.

Download Full-text

Intuitionistic fuzzy two-factor variance analysis of movie ticket sales

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219212 ◽

2021 ◽

pp. 1-11

Author(s):

Velichka Traneva ◽

Stoyan Tranev

Keyword(s):

Comparative Analysis ◽

Real Data ◽

Intuitionistic Fuzzy Sets ◽

Real Numbers ◽

Data Set ◽

Important Method ◽

Ticket Sales ◽

Intuitionistic Fuzzy ◽

The One ◽

First Time

Analysis of variance (ANOVA) is an important method in data analysis, which was developed by Fisher. There are situations when there is impreciseness in data In order to analyze such data, the aim of this paper is to introduce for the first time an intuitionistic fuzzy two-factor ANOVA (2-D IFANOVA) without replication as an extension of the classical ANOVA and the one-way IFANOVA for a case where the data are intuitionistic fuzzy rather than real numbers. The proposed approach employs the apparatus of intuitionistic fuzzy sets (IFSs) and index matrices (IMs). The paper also analyzes a unique set of data on daily ticket sales for a year in a multiplex of Cinema City Bulgaria, part of Cineworld PLC Group, applying the two-factor ANOVA and the proposed 2-D IFANOVA to study the influence of “ season ” and “ ticket price ” factors. A comparative analysis of the results, obtained after the application of ANOVA and 2-D IFANOVA over the real data set, is also presented.

Download Full-text

A Comparison Study of Mahalanobis-Taguchi System and Neural Network for Multivariate Pattern Recognition

Design Engineering, Parts A and B ◽

10.1115/imece2005-80029 ◽

2005 ◽

Cited By ~ 10

Author(s):

Jungeui Hong ◽

Elizabeth A. Cudney ◽

Genichi Taguchi ◽

Rajesh Jugulum ◽

Kioumars Paryani ◽

...

Keyword(s):

Neural Network ◽

Small Data ◽

Data Sets ◽

Comparison Study ◽

Data Set ◽

Set Size ◽

Breast Cancer Study ◽

Discriminant Ability ◽

Small Data Sets ◽

Multivariate Pattern

The Mahalanobis-Taguchi System is a diagnosis and predictive method for analyzing patterns in multivariate cases. The goal of this study is to compare the ability of the Mahalanobis-Taguchi System and a neural network to discriminate using small data sets. We examine the discriminant ability as a function of data set size using an application area where reliable data is publicly available. The study uses the Wisconsin Breast Cancer study with nine attributes and one class.

Download Full-text

Current cancer driver variant predictors learn to recognize driver genes instead of functional variants

BMC Biology ◽

10.1186/s12915-020-00930-0 ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Daniele Raimondi ◽

Antoine Passemiers ◽

Piero Fariselli ◽

Yves Moreau

Keyword(s):

Data Sets ◽

Precision Oncology ◽

Complex Task ◽

Excellent Performance ◽

Driver Genes ◽

Significant Drop ◽

Data Set ◽

Gene Effects ◽

Cancer Driver ◽

Functional Variants

Abstract Background Identifying variants that drive tumor progression (driver variants) and distinguishing these from variants that are a byproduct of the uncontrolled cell growth in cancer (passenger variants) is a crucial step for understanding tumorigenesis and precision oncology. Various bioinformatics methods have attempted to solve this complex task. Results In this study, we investigate the assumptions on which these methods are based, showing that the different definitions of driver and passenger variants influence the difficulty of the prediction task. More importantly, we prove that the data sets have a construction bias which prevents the machine learning (ML) methods to actually learn variant-level functional effects, despite their excellent performance. This effect results from the fact that in these data sets, the driver variants map to a few driver genes, while the passenger variants spread across thousands of genes, and thus just learning to recognize driver genes provides almost perfect predictions. Conclusions To mitigate this issue, we propose a novel data set that minimizes this bias by ensuring that all genes covered by the data contain both driver and passenger variants. As a result, we show that the tested predictors experience a significant drop in performance, which should not be considered as poorer modeling, but rather as correcting unwarranted optimism. Finally, we propose a weighting procedure to completely eliminate the gene effects on such predictions, thus precisely evaluating the ability of predictors to model the functional effects of single variants, and we show that indeed this task is still open.

Download Full-text

A Rank-Based Nonparametric Method for Mapping Quantitative Trait Loci in Outbred Half-Sib Pedigrees: Application to Milk Production in a Granddaughter Design

Genetics ◽

10.1093/genetics/149.3.1547 ◽

1998 ◽

Vol 149 (3) ◽

pp. 1547-1555 ◽

Cited By ~ 2

Author(s):

Wouter Coppieters ◽

Alexandre Kvasz ◽

Frédéric Farnir ◽

Juan-Jose Arranz ◽

Bernard Grisart ◽

...

Keyword(s):

Quantitative Trait Loci ◽

Quantitative Trait ◽

Real Data ◽

Interval Mapping ◽

Mapping Method ◽

Residual Variance ◽

Data Set ◽

Wilcoxon Rank Sum Test ◽

Holstein Friesian ◽

Trait Loci

Abstract We describe the development of a multipoint nonparametric quantitative trait loci mapping method based on the Wilcoxon rank-sum test applicable to outbred half-sib pedigrees. The method has been evaluated on a simulated dataset and its efficiency compared with interval mapping by using regression. It was shown that the rank-based approach is slightly inferior to regression when the residual variance is homoscedastic normal; however, in three out of four other scenarios envisaged, i.e., residual variance heteroscedastic normal, homoscedastic skewed, and homoscedastic positively kurtosed, the latter outperforms the former one. Both methods were applied to a real data set analyzing the effect of bovine chromosome 6 on milk yield and composition by using a 125-cM map comprising 15 microsatellites and a granddaughter design counting 1158 Holstein-Friesian sires.

Download Full-text

Bayesian and E-Bayesian Estimations of Bathtub-Shaped Distribution under Generalized Type-I Hybrid Censoring

Entropy ◽

10.3390/e23080934 ◽

2021 ◽

Vol 23 (8) ◽

pp. 934

Author(s):

Yuxuan Zhang ◽

Kaiwei Liu ◽

Wenhao Gui

Keyword(s):

Bayesian Method ◽

Real Data ◽

Loss Functions ◽

Type I ◽

Statistical Efficiency ◽

Unknown Parameters ◽

Data Set ◽

Hybrid Censoring ◽

Bayesian Estimations ◽

Generalized Type

For the purpose of improving the statistical efficiency of estimators in life-testing experiments, generalized Type-I hybrid censoring has lately been implemented by guaranteeing that experiments only terminate after a certain number of failures appear. With the wide applications of bathtub-shaped distribution in engineering areas and the recently introduced generalized Type-I hybrid censoring scheme, considering that there is no work coalescing this certain type of censoring model with a bathtub-shaped distribution, we consider the parameter inference under generalized Type-I hybrid censoring. First, estimations of the unknown scale parameter and the reliability function are obtained under the Bayesian method based on LINEX and squared error loss functions with a conjugate gamma prior. The comparison of estimations under the E-Bayesian method for different prior distributions and loss functions is analyzed. Additionally, Bayesian and E-Bayesian estimations with two unknown parameters are introduced. Furthermore, to verify the robustness of the estimations above, the Monte Carlo method is introduced for the simulation study. Finally, the application of the discussed inference in practice is illustrated by analyzing a real data set.

Download Full-text

On the Estimation of Reliability of Weighted Weibull Distribution: A Comparative Study

International Journal of Statistics and Probability ◽

10.5539/ijsp.v5n4p1 ◽

2016 ◽

Vol 5 (4) ◽

pp. 1

Author(s):

Bander Al-Zahrani

Keyword(s):

Maximum Likelihood ◽

Weibull Distribution ◽

Real Data ◽

Reliability Function ◽

Nonparametric Methods ◽

Empirical Method ◽

Density Estimator ◽

Bayes Estimators ◽

Unknown Parameters ◽

Data Set

The paper gives a description of estimation for the reliability function of weighted Weibull distribution. The maximum likelihood estimators for the unknown parameters are obtained. Nonparametric methods such as empirical method, kernel density estimator and a modified shrinkage estimator are provided. The Markov chain Monte Carlo method is used to compute the Bayes estimators assuming gamma and Jeffrey priors. The performance of the maximum likelihood, nonparametric methods and Bayesian estimators is assessed through a real data set.

Download Full-text