Power Calculator for Detecting Allelic Imbalance Using Hierarchical Bayesian Model

Allelic imbalance (AI) is the differential expression of the two alleles in a diploid. AI can vary between tissues, treatments, and environments. Statistical methods for testing AI exist, and the inflation of type I error in the presence of bias is well understood. For study design, however, the more important and understudied problem is type II error and power. As the biological questions for this type of study multiply and the costs of the technology plummet, which matters more: reads or replicates? How small an interaction can be detected while keeping type I error under control? Here we present a simulation study demonstrating that the proper model can control type I error below 5% for most scenarios. We find that a minimum of 2400, 480, and 240 allele-specific reads divided equally among 12, 5, and 3 replicates is needed to detect a 10%, 20%, and 30% deviation from allelic balance, respectively, in a condition with power >80%. A minimum of 960 and 240 allele-specific reads is needed to detect a 20% or 30% difference in AI between conditions with comparable power, but these reads need to be divided among 8 replicates. Adding replicates increases power more than adding coverage, without affecting type I error. We provide a Python package that simulates AI scenarios and lets users estimate type I error and power for detecting AI and differences in AI between conditions, tailored to their own study needs.

Power calculator for detecting allelic imbalance using hierarchical Bayesian model

BMC Research Notes ◽

10.1186/s13104-021-05851-x ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Katrina Sherbina ◽

Luis G. León-Novelo ◽

Sergey V. Nuzhdin ◽

Lauren M. McIntyre ◽

Fabio Marroni

Keyword(s):

Differential Expression ◽

Bayesian Model ◽

Allelic Imbalance ◽

Type I Error ◽

Type I ◽

Hierarchical Bayesian ◽

Hierarchical Bayesian Model ◽

Allele Specific ◽

Python Package

Abstract. Objective: Allelic imbalance (AI) is the differential expression of the two alleles in a diploid. AI can vary between tissues, treatments, and environments. Methods for testing AI exist, but methods are needed to estimate type I error and power for detecting AI and difference of AI between conditions. As the costs of the technology plummet, what is more important: reads or replicates? Results: We find that a minimum of 2400, 480, and 240 allele specific reads divided equally among 12, 5, and 3 replicates is needed to detect a 10, 20, and 30%, respectively, deviation from allelic balance in a condition with power > 80%. A minimum of 960 and 240 allele specific reads divided equally among 8 replicates is needed to detect a 20 or 30% difference in AI between conditions with comparable power. Higher numbers of replicates increase power more than adding coverage without affecting type I error. We provide a Python package that enables simulation of AI scenarios and enables individuals to estimate type I error and power in detecting AI and differences in AI between conditions.
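
The idea of estimating type I error and power by simulation can be illustrated with a minimal sketch. This is not the authors' package or their hierarchical Bayesian model: it swaps in a simple two-sided z-test on pooled read counts, and the function names, the parameter `theta` (assumed probability that a read comes from allele 1), and the example values are all hypothetical. Because the sketch has no between-replicate variation, splitting reads among replicates does not change its answer; capturing that effect is exactly what a hierarchical model is for.

```python
import math
import random


def z_test_p_value(allele1_reads, total_reads):
    """Two-sided p-value for H0: theta = 0.5 (normal approximation to the binomial)."""
    z = (allele1_reads - 0.5 * total_reads) / math.sqrt(0.25 * total_reads)
    return math.erfc(abs(z) / math.sqrt(2))


def estimate_rejection_rate(theta, total_reads, n_replicates,
                            n_sims=500, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which allelic balance is rejected.

    With theta = 0.5 this estimates type I error; otherwise it estimates power.
    """
    rng = random.Random(seed)
    reads_per_rep = total_reads // n_replicates
    pooled_total = reads_per_rep * n_replicates
    rejections = 0
    for _ in range(n_sims):
        pooled_allele1 = 0
        for _ in range(n_replicates):
            # Each read comes from allele 1 with probability theta (assumed model).
            pooled_allele1 += sum(rng.random() < theta for _ in range(reads_per_rep))
        if z_test_p_value(pooled_allele1, pooled_total) < alpha:
            rejections += 1
    return rejections / n_sims


# Hypothetical scenario: 480 reads over 5 replicates, balance vs. theta = 0.6.
type1 = estimate_rejection_rate(theta=0.5, total_reads=480, n_replicates=5)
power = estimate_rejection_rate(theta=0.6, total_reads=480, n_replicates=5)
print(f"type I error ~ {type1:.3f}, power ~ {power:.3f}")
```

The value `theta=0.6` is only an illustrative effect size and is not meant to correspond to the percentage deviations quoted in the abstract.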

Type-I Error and Type-II Error and Thirukkural

SSRN Electronic Journal ◽

10.2139/ssrn.1334661 ◽

2008 ◽

Cited By ~ 1

Author(s):

Chendrayan Chendroyaperumal

Keyword(s):

Type I Error ◽

Type I ◽

Type Ii ◽

Type Ii Error

BANKRUPTCY PREDICTION MODEL WITH ZETAc OPTIMAL CUT-OFF SCORE TO CORRECT TYPE I ERRORS

Gadjah Mada International Journal of Business ◽

10.22146/gamaijb.5563 ◽

2005 ◽

Vol 7 (1) ◽

pp. 41 ◽

Cited By ~ 1

Author(s):

Mohamad Iwan

Keyword(s):

Type I Error ◽

Bankruptcy Prediction ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Prior Probabilities ◽

Prediction Result ◽

Optimum Cutting ◽

Classification Errors ◽

One Year

This research examines financial ratios that distinguish between bankrupt and non-bankrupt companies and uses those distinguishing ratios to build a model that predicts bankruptcy one year in advance. It also calculates how many times more costly a type I error is than a type II error. The costs of type I and type II errors (the costs of misclassification), together with the prior probabilities of bankruptcy and non-bankruptcy, are used to calculate the ZETAc optimal cut-off score. The bankruptcy prediction obtained with the ZETAc optimal cut-off score is compared to the prediction obtained with a cut-off score that considers neither the cost of classification errors nor prior probabilities, as stated by Hair et al. (1998), referred to hereafter as the Hair et al. optimum cutting score. The comparison is intended to determine which of the two cut-off scores is better, so that the prediction is more conservative and minimizes the expected costs that may arise from classification errors. This is the first research in Indonesia that incorporates type I and II errors and prior probabilities of bankruptcy and non-bankruptcy in the computation of the cut-off score used in bankruptcy prediction.
Earlier studies gave equal weight to type I and type II errors and to the prior probabilities of bankruptcy and non-bankruptcy, whereas this research gives greater weight to type I error than to type II error, and to the prior probability of non-bankruptcy than to the prior probability of bankruptcy. This research attained the following results: (1) type I error is in fact 59.83 times more costly than type II error, (2) 22 ratios distinguish between bankrupt and non-bankrupt groups, (3) 2 financial ratios proved effective in predicting bankruptcy, (4) prediction using the ZETAc optimal cut-off score predicts more companies filing for bankruptcy within one year than prediction using the Hair et al. optimum cutting score, and (5) although prediction using the Hair et al. optimum cutting score is more accurate, prediction using the ZETAc optimal cut-off score minimizes the cost incurred from classification errors.
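
How error costs and prior probabilities shift a cut-off score can be shown with a small sketch. It is not the ZETAc computation from the paper: it simply searches candidate cut-offs for the one with the lowest expected misclassification cost. The scores, the prior of 0.1, the cost ratio of 60, and the function names are hypothetical, and the sketch assumes that a type I error means classifying a bankrupt firm as healthy and that low scores indicate bankruptcy.

```python
def expected_cost(cutoff, bankrupt_scores, healthy_scores,
                  prior_bankrupt, cost_type1, cost_type2):
    """Expected misclassification cost when firms scoring below `cutoff` are flagged."""
    # Type I (assumed): a bankrupt firm scores at or above the cut-off and is missed.
    type1_rate = sum(s >= cutoff for s in bankrupt_scores) / len(bankrupt_scores)
    # Type II (assumed): a healthy firm scores below the cut-off and is flagged.
    type2_rate = sum(s < cutoff for s in healthy_scores) / len(healthy_scores)
    return (prior_bankrupt * cost_type1 * type1_rate
            + (1 - prior_bankrupt) * cost_type2 * type2_rate)


def best_cutoff(bankrupt_scores, healthy_scores,
                prior_bankrupt, cost_type1, cost_type2):
    """Return the observed score that minimizes expected cost when used as cut-off."""
    candidates = sorted(set(bankrupt_scores) | set(healthy_scores))
    return min(
        candidates,
        key=lambda c: expected_cost(c, bankrupt_scores, healthy_scores,
                                    prior_bankrupt, cost_type1, cost_type2),
    )


# Hypothetical discriminant scores for two small groups of firms.
bankrupt = [-2.0, -1.0, -0.5, 0.5]
healthy = [0.0, 1.0, 1.5, 2.0]

equal = best_cutoff(bankrupt, healthy, prior_bankrupt=0.1,
                    cost_type1=1.0, cost_type2=1.0)
weighted = best_cutoff(bankrupt, healthy, prior_bankrupt=0.1,
                       cost_type1=60.0, cost_type2=1.0)
print(equal, weighted)
```

On this toy data the cut-off moves from 0.0 with equal costs to 1.0 when a type I error is weighted 60 times more heavily, so more firms are flagged as likely to go bankrupt. This is the same direction of change the abstract reports for the cost-aware cut-off.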

Comparing Four Methods for Estimating Tree-Based Treatment Regimes

The International Journal of Biostatistics ◽

10.1515/ijb-2016-0068 ◽

2017 ◽

Vol 13 (1) ◽

Cited By ~ 6

Author(s):

Aniek Sies ◽

Iven Van Mechelen

Keyword(s):

Simulation Study ◽

Optimal Treatment ◽

Treatment Regime ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Treatment Alternative ◽

Treatment Regimes ◽

Qualitative Interaction ◽

Error Probabilities

Abstract. When multiple treatment alternatives are available for a certain psychological or medical problem, an important challenge is to find an optimal treatment regime, which specifies for each patient the most effective treatment alternative given his or her pattern of pretreatment characteristics. The focus of this paper is on tree-based treatment regimes, which link an optimal treatment alternative to each leaf of a tree; as such they provide an insightful representation of the decision structure underlying the regime. This paper compares the absolute and relative performance of four methods for estimating regimes of that sort (viz., Interaction Trees, Model-based Recursive Partitioning, an approach developed by Zhang et al. and Qualitative Interaction Trees) in an extensive simulation study. The evaluation criteria were, on the one hand, the expected outcome if the entire population were subjected to the treatment regime resulting from each method under study and the proportion of clients assigned to the truly best treatment alternative, and, on the other hand, the Type I and Type II error probabilities of each method. The method of Zhang et al. was superior regarding the first two outcome measures and the Type II error probabilities, but performed worst in some conditions of the simulation study regarding Type I error probabilities.

Is It Really Robust?

Methodology ◽

10.1027/1614-2241/a000016 ◽

2010 ◽

Vol 6 (4) ◽

pp. 147-151 ◽

Cited By ~ 385

Author(s):

Emanuel Schmider ◽

Matthias Ziegler ◽

Erik Danay ◽

Luzi Beyer ◽

Markus Bühner

Keyword(s):

Goodness Of Fit ◽

Type I Error ◽

Effect Sizes ◽

Random Numbers ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Normality Assumption ◽

Different Types ◽

Factor Type

Empirical evidence for the robustness of the analysis of variance (ANOVA) to violation of the normality assumption is presented by means of Monte Carlo methods. High-quality samples from normally, rectangularly, and exponentially distributed basic populations are created by drawing random numbers from the respective generators, checking their goodness of fit, and allowing only the best 10% to take part in the investigation. A one-way fixed-effect design with three groups of 25 values each is chosen. Effect sizes are implemented in the samples and varied over a broad range. Comparing the outcomes of the ANOVA calculations for the different types of distributions gives reason to regard the ANOVA as robust. Both the empirical type I error α and the empirical type II error β remain constant under violation. Moreover, regression analysis identifies the factor “type of distribution” as not significant in explaining the ANOVA results.
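
A Monte Carlo check of this kind can be sketched in a few lines. The sketch below is a simplified illustration, not the authors' procedure: it omits the goodness-of-fit filtering and the effect-size manipulation, and only estimates the empirical type I error of a one-way ANOVA with three groups of 25 values under three distributions. The critical value is an assumed constant read from standard F tables, and all names are hypothetical.

```python
import random

N_GROUPS = 3
N_PER_GROUP = 25
# Approximate upper 5% critical value of F(2, 72), assumed from standard F
# tables; it is only valid for the 3 x 25 design fixed above.
F_CRIT_2_72 = 3.12


def one_way_f(groups):
    """F statistic of a one-way fixed-effect ANOVA."""
    means = [sum(g) / len(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def empirical_type1(draw, n_sims=2000, seed=0):
    """Share of null simulations (all groups share one distribution) that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        groups = [[draw(rng) for _ in range(N_PER_GROUP)] for _ in range(N_GROUPS)]
        if one_way_f(groups) > F_CRIT_2_72:
            rejections += 1
    return rejections / n_sims


samplers = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "rectangular": lambda rng: rng.uniform(0.0, 1.0),
    "exponential": lambda rng: rng.expovariate(1.0),
}
rates = {name: empirical_type1(draw) for name, draw in samplers.items()}
for name, rate in rates.items():
    print(f"{name}: empirical type I error ~ {rate:.3f}")
```

If the ANOVA is robust in the sense described above, the three rates should all sit near the nominal 5% level.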

When Studies are in Error: Basic Statistical Vocabulary Needed to Understand Clinical Studies

Journal of Cutaneous Medicine and Surgery ◽

10.1177/120347549600100108 ◽

1996 ◽

Vol 1 (1) ◽

pp. 25-28 ◽

Cited By ~ 1

Author(s):

Martin A. Weinstock

Keyword(s):

Null Hypothesis ◽

Statistical Power ◽

Critical Appraisal ◽

Type I Error ◽

Statistical Significance ◽

P Value ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Error Type

Background: Accurate understanding of certain basic statistical terms and principles is key to critical appraisal of published literature. Objective: This review describes type I error, type II error, null hypothesis, p value, statistical significance, α, two-tailed and one-tailed tests, effect size, alternate hypothesis, statistical power, β, publication bias, confidence interval, standard error, and standard deviation, while including examples from reports of dermatologic studies. Conclusion: The application of the results of published studies to individual patients should be informed by an understanding of certain basic statistical concepts.

Automatic Real-Time Identification of Fingerprint Images Using Wavelets and Gradient of Gaussian

Journal of Circuits System and Computers ◽

10.1142/s0218126697000322 ◽

1997 ◽

Vol 07 (05) ◽

pp. 433-440 ◽

Cited By ~ 3

Author(s):

Woo Kyu Lee ◽

Jae Ho Chung

Keyword(s):

Wavelet Transform ◽

Real Time ◽

Type I Error ◽

Recognition Algorithm ◽

Fingerprint Recognition ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Local Orientation ◽

Real Time Identification

In this paper, a fingerprint recognition algorithm is suggested. The algorithm is based on the wavelet transform and on the dominant local orientation, which is derived from the coherence and the gradient of Gaussian. By using the wavelet transform, the algorithm does not require conventional preprocessing procedures such as smoothing, binarization, thinning, and restoration. Computer simulation results show that when the rate of Type II error (incorrect recognition of two different fingerprints as identical) is held at 0.0%, the rate of Type I error (incorrect recognition of two identical fingerprints as different ones) is 2.5% in real time.

A New and Simpler Approximation for ANOVA Under Variance Heterogeneity

Journal of Educational Statistics ◽

10.3102/10769986019002091 ◽

1994 ◽

Vol 19 (2) ◽

pp. 91-101 ◽

Cited By ~ 28

Author(s):

Ralph A. Alexander ◽

Diane M. Govern

Keyword(s):

Type I Error ◽

Order Approximation ◽

Error Rates ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Tail Probabilities ◽

Type I Error Rates ◽

Heterogeneity Of Variance ◽

The Face

A new approximation is proposed for testing the equality of k independent means in the face of heterogeneity of variance. Monte Carlo simulations show that the new procedure has Type I error rates that are very nearly nominal and Type II error rates that are quite close to those produced by James’s (1951) second-order approximation. In addition, it is computationally the simplest approximation yet to appear, and it is easily applied to Scheffé (1959)-type multiple contrasts and to the calculation of approximate tail probabilities.

Symmetry's Mandate: Constraining the Politicization of American Administrative Law

Michigan Law Review ◽

10.36644/mlr.119.3.symmetry ◽

2020 ◽

pp. 455

Author(s):

Daniel Walters

Keyword(s):

Type I Error ◽

Theoretical Perspective ◽

Administrative Law ◽

Political Conflict ◽

Regulatory Action ◽

Positive Case ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Call Type

Recent years have seen the rise of pointed and influential critiques of deference doctrines in administrative law. What many of these critiques have in common is a view that judges, not agencies, should resolve interpretive disputes over the meaning of statutes—disputes the critics take to be purely legal and almost always resolvable using lawyerly tools of statutory construction. In this Article, I take these critiques, and the relatively formalist assumptions behind them, seriously and show that the critics have not acknowledged or advocated the full reform vision implied by their theoretical premises. Specifically, critics have extended their critique of judicial abdication only to what I call Type I statutory errors (that is, agency interpretations that regulate more conduct than the best reading of the statute would allow the agency to regulate) and do not appear to accept or anticipate that their theory of interpretation would also extend to what I call Type II statutory errors (that is, agency failures to regulate as much conduct as the best reading of the statute would require). As a consequence, critics have been more than willing to entertain an end to Chevron deference, an administrative law doctrine that is mostly invoked to justify Type I error, but have not shown any interest in adjusting administrative law doctrine to remedy agencies’ commission of Type II error. The result is a vision of administrative law’s future that is precariously slanted against legislative and regulatory action. I critique this asymmetry in administrative law and address potential justifications of systemic asymmetries in the doctrine, such as concern about the remedial implications of addressing Type II error, finding them all wanting from a legal and theoretical perspective. I also lay out the positive case for adhering to symmetry in administrative law doctrine. 
In a time of deep political conflict over regulation and administration, symmetry plays, or at the very least could play, an important role in depoliticizing administrative law, clarifying what is at stake in debates about the proper level of deference to agency legal interpretations, and disciplining partisan gamesmanship. I suggest that when the conversation is so disciplined, an administrative law without deference to both Type I and Type II error is hard to imagine due to the high judicial costs of minimizing Type II error, but if we collectively choose to discard deference notwithstanding these costs, it would be a more sustainable political choice for administrative law than embracing the current, one-sided critique of deference.

COMPARISON OF TRADITIONAL AND MACHINE LEARNING BASE METHODS FOR GROUND POINT CLOUD LABELING

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-w18-141-2019 ◽

2019 ◽

Vol XLII-4/W18 ◽

pp. 141-145

Author(s):

S. M. Ayazi ◽

M. Saadat Seresht

Keyword(s):

Machine Learning ◽

Point Cloud ◽

Type I Error ◽

Total Error ◽

Training Data ◽

Lidar Data ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Digital Photogrammetry

Abstract. A variety of methods have been proposed to distinguish ground and non-ground points in point cloud data. Most fully automated methods share a common disadvantage: the algorithm does not respond properly across all areas and ground levels, so most of these algorithms perform well in simple landscapes but encounter problems in complex ones. Point cloud filtering techniques can be divided into two general groups: rule-based methods and novel methods. The use of machine learning techniques has improved classification results significantly, especially when the data can be labelled in the presence of training data. In this paper, altimetric and radiometric features are first extracted from the LiDAR data and from the point cloud derived from digital photogrammetry. These features are then used in a classification process based on SVM and random forest methods, and the points are classified as ground or non-ground. The classification results using this method on LiDAR data show a total error of 6.2%, a type I error of 5.4%, and a type II error of 13.2%. The comparison of the proposed method with the results of LASTools software shows a reduction in total error and type I error (while increasing the type II error). This method was also investigated on the dense point cloud obtained from digital photogrammetry; in this case the total error was 7.2%, the type I error was 6.8%, and the type II error was 10.9%.
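
The three error measures quoted above can be computed from a confusion matrix, as in the sketch below. The abstract does not define them, so the sketch assumes the convention common in ground-filtering evaluations: type I error is the share of true ground points labelled non-ground, type II error is the share of true non-ground points labelled ground, and total error is the share of all points mislabelled. The function name and the counts are hypothetical.

```python
def ground_filter_errors(ground_as_ground, ground_as_nonground,
                         nonground_as_ground, nonground_as_nonground):
    """Return (type I, type II, total) error rates from confusion-matrix counts."""
    n_ground = ground_as_ground + ground_as_nonground
    n_nonground = nonground_as_ground + nonground_as_nonground
    # Type I (assumed definition): ground points rejected as non-ground.
    type1 = ground_as_nonground / n_ground
    # Type II (assumed definition): non-ground points accepted as ground.
    type2 = nonground_as_ground / n_nonground
    # Total error: all mislabelled points over all points.
    total = (ground_as_nonground + nonground_as_ground) / (n_ground + n_nonground)
    return type1, type2, total


# Hypothetical counts for a small labelled tile.
type1, type2, total = ground_filter_errors(
    ground_as_ground=950, ground_as_nonground=50,
    nonground_as_ground=30, nonground_as_nonground=170,
)
print(f"type I = {type1:.1%}, type II = {type2:.1%}, total = {total:.1%}")
```

Because ground points usually outnumber non-ground points, the total error tends to sit closer to the type I rate than to the type II rate, which matches the pattern in the figures reported above.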
