Sequential Sampling for Optimal Bayesian Classification of Sequencing Count Data

Author(s):  
Ariana Broumand ◽  
Siamak Zamani Dadaneh
Plant Disease ◽  
2007 ◽  
Vol 91 (8) ◽  
pp. 1013-1020 ◽  
Author(s):  
David H. Gent ◽  
William W. Turechek ◽  
Walter F. Mahaffee

Sequential sampling models for estimation and classification of the incidence of powdery mildew (caused by Podosphaera macularis) on hop (Humulus lupulus) cones were developed using parameter estimates of the binary power law derived from the analysis of 221 transect data sets (model construction data set) collected from 41 hop yards sampled in Oregon and Washington from 2000 to 2005. Stop lines, models that determine when sufficient information has been collected to estimate mean disease incidence and stop sampling, for sequential estimation were validated by bootstrap simulation using a subset of 21 model construction data sets and simulated sampling of an additional 13 model construction data sets. Achieved coefficient of variation (C) approached the prespecified C as the estimated disease incidence, [Formula: see text], increased, although achieving a C of 0.1 was not possible for data sets in which [Formula: see text] < 0.03 with the number of sampling units evaluated in this study. The 95% confidence interval of the median difference between [Formula: see text] of each yard (achieved by sequential sampling) and the true p of the original data set included 0 for all 21 data sets evaluated at levels of C of 0.1 and 0.2. For sequential classification, operating characteristic (OC) and average sample number (ASN) curves of the sequential sampling plans obtained by bootstrap analysis and simulated sampling were similar to the OC and ASN values determined by Monte Carlo simulation. Correct decisions of whether disease incidence was above or below prespecified thresholds (pt) were made for 84.6 or 100% of the data sets during simulated sampling when stop lines were determined assuming a binomial or beta-binomial distribution of disease incidence, respectively. 
However, the higher proportion of correct decisions obtained by assuming a beta-binomial distribution of disease incidence required, on average, sampling 3.9 more plants per sampling round to classify disease incidence compared with the binomial distribution. Use of these sequential sampling plans may aid growers in deciding the order in which to harvest hop yards to minimize the risk of a condition called “cone early maturity” caused by late-season infection of cones by P. macularis. Also, sequential sampling could aid in research efforts, such as efficacy trials, where many hop cones are assessed to determine disease incidence.
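
The abstract above does not give the formulas behind its stop lines. As a rough illustration of sequential classification under the binomial assumption only, the sketch below uses Wald's sequential probability ratio test, which is a swapped-in standard technique and not necessarily the authors' procedure. The names `p0` and `p1` (incidences bracketing the threshold pt), `alpha`, `beta`, and both function names are hypothetical.

```python
import math


def sprt_stop_lines(p0, p1, alpha, beta):
    """Return (slope, lower_intercept, upper_intercept) of Wald's binomial SPRT.

    p0 and p1 are assumed incidences just below and just above the decision
    threshold; alpha and beta are the tolerated error rates. All four are
    illustrative assumptions, not values taken from the abstract.
    """
    if not 0 < p0 < p1 < 1:
        raise ValueError("require 0 < p0 < p1 < 1")
    log_odds_ratio = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / log_odds_ratio
    lower = math.log(beta / (1 - alpha)) / log_odds_ratio  # negative intercept
    upper = math.log((1 - beta) / alpha) / log_odds_ratio  # positive intercept
    return slope, lower, upper


def classify(observations, p0, p1, alpha=0.05, beta=0.05):
    """Scan 0/1 observations in order and stop when a stop line is crossed.

    Returns (decision, number_of_units_examined), where decision is
    "above", "below", or "undecided" if the data run out first.
    """
    slope, lower, upper = sprt_stop_lines(p0, p1, alpha, beta)
    diseased = 0
    for n, x in enumerate(observations, start=1):
        diseased += x
        if diseased >= upper + slope * n:
            return "above", n
        if diseased <= lower + slope * n:
            return "below", n
    return "undecided", len(observations)
```

Called as `classify(cone_states, p0=0.05, p1=0.2)` on a sequence of 0/1 disease scores, the function stops as soon as the cumulative count of diseased units leaves the band between the two parallel stop lines. A beta-binomial variant would need an additional aggregation parameter and is not sketched here.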


2007 ◽  
Vol 18 (4) ◽  
pp. 605-612 ◽  
Author(s):  
Jan‐Philip M. Witte ◽  
Rafał B. Wójcik ◽  
Paul J.J.F. Torfs ◽  
Martin W.H. Haan ◽  
Stephan Hennekens

Tellus B ◽  
2018 ◽  
Vol 70 (1) ◽  
pp. 1-10 ◽  
Author(s):  
M. A. Zaidan ◽  
V. Haapasilta ◽  
R. Relan ◽  
H. Junninen ◽  
P. P. Aalto ◽  
...  

2016 ◽  
Vol 55 (4) ◽  
pp. 1425-1438 ◽  
Author(s):  
Prashant Singh ◽  
Joachim van der Herten ◽  
Dirk Deschrijver ◽  
Ivo Couckuyt ◽  
Tom Dhaene

2007 ◽  
Vol 38 (9) ◽  
pp. 52-62 ◽  
Author(s):  
Humikazu Mitomi ◽  
Fuyuki Fujiwara ◽  
Masanobu Yamamoto ◽  
Taisuke Sato

2019 ◽  
Vol 13 ◽  
pp. 117793221986081 ◽  
Author(s):  
Takayuki Osabe ◽  
Kentaro Shimizu ◽  
Koji Kadota

Empirical Bayes is a framework of choice for differential expression (DE) analysis of multi-group RNA-seq count data. Its characteristic ability to compute posterior probabilities for predefined expression patterns allows users to assign the pattern with the highest posterior probability to the gene under consideration. However, current Bayesian methods such as baySeq and EBSeq can be improved, especially with respect to normalization. Two R packages (baySeq and EBSeq) with their default normalization settings and with other normalization methods (MRN and TCC) were compared using three-group simulation data and real count data. Our findings were as follows: (1) the Bayesian methods coupled with TCC normalization performed comparably to or better than those with the default normalization settings under various simulation scenarios, (2) the default DE pipeline provided in TCC, which implements a generalized linear model framework, was still superior to the Bayesian methods with TCC normalization when the overall degree of DE was evaluated, and (3) baySeq with TCC was robust against different choices of possible expression patterns. In practice, we recommend using the default DE pipeline provided in TCC to obtain the overall gene ranking and then using baySeq with TCC normalization to assign the most plausible expression patterns to individual genes.
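
The pattern-assignment step described in this abstract amounts to picking, for each gene, the expression pattern with the largest posterior probability. A minimal sketch follows, assuming the posteriors have already been computed elsewhere; the function name, the gene names, the pattern labels, and the probabilities are all made up for illustration and do not come from baySeq, EBSeq, or the study.

```python
def assign_patterns(posteriors):
    """Map each gene to the expression pattern with the highest posterior."""
    return {gene: max(probs, key=probs.get) for gene, probs in posteriors.items()}


# Made-up posterior probabilities for two hypothetical genes across
# three hypothetical three-group expression patterns.
example = {
    "geneA": {"G1=G2=G3": 0.05, "G1>G2=G3": 0.90, "G1=G2>G3": 0.05},
    "geneB": {"G1=G2=G3": 0.80, "G1>G2=G3": 0.15, "G1=G2>G3": 0.05},
}
print(assign_patterns(example))
```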


IEEE Expert ◽  
1992 ◽  
Vol 7 (4) ◽  
pp. 67-75 ◽  
Author(s):  
L. Hunter ◽  
D.J. States
