A Quadratic Classification Rule with Equicorrelated Training Vectors for Non Random Samples

2010 ◽  
Vol 40 (2) ◽  
pp. 213-231 ◽  
Author(s):  
Ricardo Leiva ◽  
Anuradha Roy

1976 ◽  
Vol 36 (01) ◽  
pp. 071-077 ◽  
Author(s):  
Daniel E. Whitman ◽  
Mary Ellen Switzer ◽  
Patrick A. McKee

Summary: The availability of factor VIII concentrates is frequently a limitation in the management of classical hemophilia. Such concentrates are prepared from fresh or fresh-frozen plasma. A significant volume of plasma in the United States becomes “indated”, i.e., in contact with red blood cells for 24 hours at 4°, and is therefore not used to prepare factor VIII concentrates. To evaluate this possible resource, partially purified factor VIII was prepared from random samples of fresh-frozen, indated and outdated plasma. The yield of factor VIII protein and procoagulant activity from indated plasma was about the same as that from fresh-frozen plasma. The yield from outdated plasma was substantially less. After further purification, factor VIII from the three sources gave a single subunit band when reduced and analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis. These results indicate that the approximately 287,000 liters of indated plasma processed annually by the American National Red Cross (ANRC) could be used to prepare factor VIII concentrates of good quality. This resource alone could quadruple the supply of factor VIII available for therapy.


2002 ◽  
Vol 7 (1) ◽  
pp. 31-42
Author(s):  
J. Šaltytė ◽  
K. Dučinskas

The Bayesian classification rule used for classifying observations of (second-order) stationary Gaussian random fields with different means and common factorised covariance matrices is investigated. The influence of augmenting the observed data on the Bayesian risk is examined for three different, widely applicable nonlinear spatial correlation models. An explicit expression for the Bayesian risk in the classification of augmented data is derived. A numerical comparison of these models by the variability of the Bayesian risk under the first-order neighbourhood scheme is performed.


2001 ◽  
Vol 6 (2) ◽  
pp. 15-28 ◽  
Author(s):  
K. Dučinskas ◽  
J. Šaltytė

The problem of classifying a realisation of a stationary univariate Gaussian random field into one of two populations with different means and different factorised covariance matrices is considered. In such a case, the optimal classification rule, in the sense of minimum probability of misclassification, is associated with a non-linear (quadratic) discriminant function. The unknown means and covariance matrices of the feature vector components are estimated from spatially correlated training samples using the maximum likelihood approach, assuming the spatial correlations to be known. An explicit formula for the Bayes error rate and the first-order asymptotic expansion of the expected error rate associated with the quadratic plug-in discriminant function are presented. A set of numerical calculations for the spherical spatial correlation function is performed, and two different spatial sampling designs are compared.
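
As a minimal illustration of the quadratic discriminant idea in the abstract above, the sketch below assigns a single univariate observation to one of two Gaussian populations with different means and variances. It is a simplification under stated assumptions: the parameters are treated as known, the spatial correlation and plug-in estimation studied in the paper are ignored, and the function names (`log_gaussian_pdf`, `quadratic_discriminant`, `classify`) are hypothetical.

```python
import math


def log_gaussian_pdf(x, mean, var):
    """Log density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def quadratic_discriminant(x, mean1, var1, mean2, var2, prior1=0.5):
    """Log posterior odds of population 1 versus population 2.

    With unequal variances the x**2 terms do not cancel, so the
    function is quadratic in x (assumption: parameters are known).
    """
    prior2 = 1.0 - prior1
    return (
        math.log(prior1 / prior2)
        + log_gaussian_pdf(x, mean1, var1)
        - log_gaussian_pdf(x, mean2, var2)
    )


def classify(x, mean1, var1, mean2, var2, prior1=0.5):
    """Return 1 if the discriminant is non-negative, otherwise 2."""
    w = quadratic_discriminant(x, mean1, var1, mean2, var2, prior1)
    return 1 if w >= 0.0 else 2
```

For example, with population 1 as N(0, 1) and population 2 as N(3, 4), an observation far to the left of both means is assigned to population 2 because of its larger variance, which a linear rule could not reproduce.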


2021 ◽  
Vol 11 (6) ◽  
pp. 2511
Author(s):  
Julian Hatwell ◽  
Mohamed Medhat Gaber ◽  
R. Muhammad Atif Azad

This research presents Gradient Boosted Tree High Importance Path Snippets (gbt-HIPS), a novel, heuristic method for explaining gradient boosted tree (GBT) classification models by extracting a single classification rule (CR) from the ensemble of decision trees that make up the GBT model. This CR contains the most statistically important boundary values of the input space as antecedent terms. The CR represents a hyper-rectangle of the input space inside which the GBT model is, very reliably, classifying all instances with the same class label as the explanandum instance. In a benchmark test using nine data sets and five competing state-of-the-art methods, gbt-HIPS offered the best trade-off between coverage (0.16–0.75) and precision (0.85–0.98). Unlike competing methods, gbt-HIPS is also demonstrably guarded against under- and over-fitting. A further distinguishing feature of our method is that, unlike much prior work, our explanations also provide counterfactual detail in accordance with widely accepted recommendations for what makes a good explanation.
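
To make the hyper-rectangle reading of a classification rule concrete, the sketch below evaluates a rule given as per-feature intervals against a set of instances. It is not gbt-HIPS itself. It assumes the common definitions of coverage (share of instances that satisfy the rule) and precision (share of covered instances that the model labels with the target class), which the abstract does not spell out. The names `covers` and `coverage_and_precision` and the dictionary rule format are hypothetical.

```python
def covers(rule, instance):
    """True if the instance lies inside the hyper-rectangle.

    `rule` maps a feature name to a (low, high) interval;
    features not named in the rule are unconstrained.
    """
    return all(low <= instance[name] <= high for name, (low, high) in rule.items())


def coverage_and_precision(rule, instances, predicted_labels, target_label):
    """Return (coverage, precision) of a rule against model predictions."""
    covered = [
        label
        for instance, label in zip(instances, predicted_labels)
        if covers(rule, instance)
    ]
    coverage = len(covered) / len(instances)
    # Precision is reported as 0.0 when the rule covers nothing (a convention chosen here).
    precision = (
        sum(label == target_label for label in covered) / len(covered)
        if covered
        else 0.0
    )
    return coverage, precision
```

A rule that constrains only a few features covers more instances but tends to be less precise, which is the trade-off the benchmark figures describe.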


Science ◽  
2021 ◽  
pp. eabh0635
Author(s):  
James A. Hay ◽  
Lee Kennedy-Shaffer ◽  
Sanjat Kanjilal ◽  
Niall J. Lennon ◽  
Stacey B. Gabriel ◽  
...  

Estimating an epidemic’s trajectory is crucial for developing public health responses to infectious diseases, but case data used for such estimation are confounded by variable testing practices. We show that the population distribution of viral loads observed under random or symptom-based surveillance, in the form of cycle threshold (Ct) values obtained from reverse-transcription quantitative polymerase chain reaction testing, changes during an epidemic. Thus, Ct values from even limited numbers of random samples can provide improved estimates of an epidemic’s trajectory. Combining data from multiple such samples improves the precision and robustness of such estimation. We apply our methods to Ct values from surveillance conducted during the SARS-CoV-2 pandemic in a variety of settings and offer alternative approaches for real-time estimates of epidemic trajectories for outbreak management and response.


Author(s):  
Frank Ecker ◽  
Jennifer Francis ◽  
Per Olsson ◽  
Katherine Schipper

Abstract: This paper investigates how data requirements often encountered in archival accounting research can produce a data-restricted sample that is a non-random selection of observations from the reference sample to which the researcher wishes to generalize results. We illustrate the effects of non-random sampling on results of association tests in a setting with data on one variable of interest for all observations and frequently-missing data on another variable of interest. We develop and validate a resampling approach that uses only observations from the data-restricted sample to construct distribution-matched samples that approximate randomly-drawn samples from the reference sample. Our simulation tests provide evidence that distribution-matched samples yield generalizable results. We demonstrate the effects of non-random sampling in tests of the association between realized returns and five implied cost of equity metrics. In this setting, the reference sample has full information on realized returns, while on average only 16% of reference sample observations have data on cost of equity metrics. Consistent with prior research (e.g., Easton and Monahan, The Accounting Review 80, 501–538, 2005), analysis using the unadjusted (non-random) cost of equity sample reveals weak or negative associations between realized returns and cost of equity metrics. In contrast, using distribution-matched samples, we find reliable evidence of the theoretically-predicted positive association. We also conceptually and empirically compare distribution-matching with multiple imputation and selection models, two other approaches to dealing with non-random samples.
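
The abstract does not describe the resampling procedure in detail, so the following is only one plausible sketch of the general idea, not the authors' method. It assumes that matching is done by binning the fully observed variable, and that rows are drawn with replacement from the data-restricted sample so that each bin's share equals its share in the reference sample. The function names, the bin edges, and the row layout (fully observed value first) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def bin_index(value, edges):
    """Index of the bin containing `value`, given ascending bin edges."""
    return sum(value >= edge for edge in edges)


def distribution_matched_sample(reference_values, restricted_rows, edges, size, seed=0):
    """Resample restricted rows to mimic the reference distribution.

    reference_values: the fully observed variable for every reference observation.
    restricted_rows: tuples whose first element is that same variable,
        available only for the data-restricted subset.
    """
    rng = random.Random(seed)
    reference_counts = Counter(bin_index(v, edges) for v in reference_values)

    rows_by_bin = defaultdict(list)
    for row in restricted_rows:
        rows_by_bin[bin_index(row[0], edges)].append(row)

    sample = []
    for b, count in sorted(reference_counts.items()):
        target = round(size * count / len(reference_values))
        candidates = rows_by_bin.get(b, [])
        if candidates:  # bins with no restricted rows cannot be matched
            sample.extend(rng.choices(candidates, k=target))
    return sample
```

Because per-bin targets are rounded and empty bins are skipped, the returned sample may differ slightly from the requested size; a fuller treatment would need to handle both cases explicitly.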


Insects ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 279
Author(s):  
Anders Lindström ◽  
Disa Eklöf ◽  
Tobias Lilja

In the lower Dalälven region, floodwater mosquitoes cause recurring problems. The main nuisance species is Aedes (Ochlerotatus) sticticus, but large numbers of Aedes (Aedes) rossicus and Aedes (Aedes) cinereus also hatch during flooding events. To increase understanding of which environments in the area give rise to mosquito nuisance, soil samples were taken from 20 locations from four environmental categories: grazed meadows, mowed meadows, unkept open grassland areas and forest areas. In each location 20 soil samples were taken, 10 from random locations and 10 from moisture-retaining structures, such as tussocks, shrubs, piles of leaves, logs, and roots. The soil samples were soaked with tap water in the lab, and mosquito larvae were collected and allowed to develop to adult mosquitoes for species identification. Fewer larvae hatched from mowed areas, and more larvae hatched from moisture-retaining structure samples than from random samples. The results showed that Aedes cinereus mostly hatched from grazed and unkept areas and hatched as much from random samples as from structures, whereas Aedes sticticus and Aedes rossicus hatched from open unkept and forest areas and hatched significantly more from structure samples. When the moisture-retaining structures in open unkept areas where Aedes sticticus hatched were identified, it was clear that they hatched predominantly from willow shrubs that offered shade. The results suggest that Ae. sticticus and Ae. cinereus favor different flooded environments for oviposition.

