Refining random samples

2022 ◽  
pp. 37-55
1976 ◽  
Vol 36 (01) ◽  
pp. 071-077 ◽  
Author(s):  
Daniel E. Whitman ◽  
Mary Ellen Switzer ◽  
Patrick A. McKee

Summary: The availability of factor VIII concentrates is frequently a limitation in the management of classical hemophilia. Such concentrates are prepared from fresh or fresh-frozen plasma. A significant volume of plasma in the United States becomes “indated”, i.e., in contact with red blood cells for 24 hours at 4°, and is therefore not used to prepare factor VIII concentrates. To evaluate this possible resource, partially purified factor VIII was prepared from random samples of fresh-frozen, indated and outdated plasma. The yield of factor VIII protein and procoagulant activity from indated plasma was about the same as that from fresh-frozen plasma. The yield from outdated plasma was substantially less. After further purification, factor VIII from the three sources gave a single subunit band when reduced and analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis. These results indicate that the approximately 287,000 liters of indated plasma processed annually by the American National Red Cross (ANRC) could be used to prepare factor VIII concentrates of good quality. This resource alone could quadruple the supply of factor VIII available for therapy.


Science ◽  
2021 ◽  
pp. eabh0635
Author(s):  
James A. Hay ◽  
Lee Kennedy-Shaffer ◽  
Sanjat Kanjilal ◽  
Niall J. Lennon ◽  
Stacey B. Gabriel ◽  
...  

Estimating an epidemic’s trajectory is crucial for developing public health responses to infectious diseases, but case data used for such estimation are confounded by variable testing practices. We show that the population distribution of viral loads observed under random or symptom-based surveillance, in the form of cycle threshold (Ct) values obtained from reverse-transcription quantitative polymerase chain reaction testing, changes during an epidemic. Thus, Ct values from even limited numbers of random samples can provide improved estimates of an epidemic’s trajectory. Combining data from multiple such samples improves the precision and robustness of such estimation. We apply our methods to Ct values from surveillance conducted during the SARS-CoV-2 pandemic in a variety of settings and offer alternative approaches for real-time estimates of epidemic trajectories for outbreak management and response.


Author(s):  
Frank Ecker ◽  
Jennifer Francis ◽  
Per Olsson ◽  
Katherine Schipper

Abstract: This paper investigates how data requirements often encountered in archival accounting research can produce a data-restricted sample that is a non-random selection of observations from the reference sample to which the researcher wishes to generalize results. We illustrate the effects of non-random sampling on results of association tests in a setting with data on one variable of interest for all observations and frequently missing data on another variable of interest. We develop and validate a resampling approach that uses only observations from the data-restricted sample to construct distribution-matched samples that approximate randomly drawn samples from the reference sample. Our simulation tests provide evidence that distribution-matched samples yield generalizable results. We demonstrate the effects of non-random sampling in tests of the association between realized returns and five implied cost of equity metrics. In this setting, the reference sample has full information on realized returns, while on average only 16% of reference sample observations have data on cost of equity metrics. Consistent with prior research (e.g., Easton and Monahan, The Accounting Review 80, 501–538, 2005), analysis using the unadjusted (non-random) cost of equity sample reveals weak or negative associations between realized returns and cost of equity metrics. In contrast, using distribution-matched samples, we find reliable evidence of the theoretically predicted positive association. We also conceptually and empirically compare distribution-matching with multiple imputation and selection models, two other approaches to dealing with non-random samples.
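The resampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes, purely for illustration, that matching is done by binning the fully observed variable at reference-sample quantiles and then drawing equally many restricted-sample observations (with replacement) from each bin. The function name `distribution_matched_sample` and its parameters are hypothetical.

```python
import random
from bisect import bisect_right


def distribution_matched_sample(reference_x, restricted, n_bins=10, size=None, seed=0):
    """Resample (x, y) pairs from `restricted` so that the distribution of x
    roughly follows the distribution of `reference_x`.

    Assumption (not from the paper): matching uses equal-mass quantile bins
    of the reference sample and equal draws per bin.
    """
    rng = random.Random(seed)
    ref = sorted(reference_x)
    n = len(ref)

    # Interior bin edges at reference quantiles, so each bin holds roughly
    # 1 / n_bins of the reference observations.
    edges = [ref[(k * n) // n_bins] for k in range(1, n_bins)]

    # Assign every restricted observation to its reference-quantile bin.
    bins = [[] for _ in range(n_bins)]
    for x, y in restricted:
        bins[bisect_right(edges, x)].append((x, y))

    if size is None:
        size = len(restricted)
    per_bin = size // n_bins

    matched = []
    for members in bins:
        if members:  # a bin with no restricted observations cannot be matched
            matched.extend(rng.choices(members, k=per_bin))
    return matched


# Toy data: x is known for everyone, y only for a non-random subset that
# over-represents large x.
reference_x = [i / 1000 for i in range(1000)]
restricted = [(x, 2 * x) for i, x in enumerate(reference_x) if i >= 500 or i % 5 == 0]
matched = distribution_matched_sample(reference_x, restricted)
print(sum(x for x, _ in restricted) / len(restricted))  # mean of x before matching
print(sum(x for x, _ in matched) / len(matched))        # mean of x after matching
```

In this toy setup the restricted sample over-represents large values of x, and the matched sample pulls the mean of x back toward the reference mean of about 0.5. Any association test would then be run on `matched` rather than on `restricted`.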


Insects ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 279
Author(s):  
Anders Lindström ◽  
Disa Eklöf ◽  
Tobias Lilja

In the lower Dalälven region, floodwater mosquitoes cause recurring problems. The main nuisance species is Aedes (Ochlerotatus) sticticus, but large numbers of Aedes (Aedes) rossicus and Aedes (Aedes) cinereus also hatch during flooding events. To increase understanding of which environments in the area give rise to mosquito nuisance, soil samples were taken from 20 locations in four environmental categories: grazed meadows, mowed meadows, unkept open grassland areas and forest areas. In each location 20 soil samples were taken, 10 from random locations and 10 from moisture-retaining structures, such as tussocks, shrubs, piles of leaves, logs, and roots. The soil samples were soaked with tap water in the lab, and mosquito larvae were collected and allowed to develop into adult mosquitoes for species identification. Fewer larvae hatched from mowed areas, and more larvae hatched from moisture-retaining structure samples than from random samples. The results showed that Aedes cinereus mostly hatched from grazed and unkept areas and hatched as much from random samples as from structures, whereas Aedes sticticus and Aedes rossicus hatched from open unkept and forest areas and hatched significantly more from structure samples. When the moisture-retaining structures in open unkept areas where Aedes sticticus hatched were identified, it was clear that they hatched predominantly from willow shrubs that offered shade. The results suggest that Ae. sticticus and Ae. cinereus favor different flooded environments for oviposition.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 740
Author(s):  
Hoshin V. Gupta ◽  
Mohammad Reza Ehsani ◽  
Tirthankar Roy ◽  
Maria A. Sans-Fuentes ◽  
Uwe Ehret ◽  
...  

We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) and Kernel Density (KD) methods. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data generating probability density function (pdf) into equal-probability-mass intervals. And, whereas BC and KD each require optimal tuning of a hyper-parameter whose value varies with sample size and shape of the pdf, QS only requires specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantiles is a fixed fraction of the sample size (empirically determined to be ~0.25–0.35), and that this value is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC and KD since hyper-parameter tuning is not required. Further, unlike KD, there is no need to select an appropriate kernel-type, and so QS is applicable to pdfs of arbitrary shape, including those with discontinuous slope and/or magnitude. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is low even for samples of as few as 100 data points; in contrast, for KD the small sample bias can be as large as -10% and for BC as large as -50%. We speculate that estimating quantile locations, rather than bin-probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data generating pdf.
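The core idea of estimating entropy from equal-probability-mass intervals can be sketched as follows. This is a simplified illustration, not the authors' QS algorithm: it assumes the interval edges are taken directly from interpolated order statistics of the sample, with the sample minimum and maximum as the outer edges, and it omits the bootstrap step. The function name and the default `fraction=0.25` (chosen from the range quoted in the abstract) are illustrative.

```python
import math
import random


def quantile_spacing_entropy(sample, fraction=0.25):
    """Entropy estimate (in nats) from equal-probability-mass intervals."""
    xs = sorted(sample)
    n = len(xs)
    n_q = max(2, int(fraction * n))  # number of intervals, a fraction of n

    # Interval edges: linearly interpolated order statistics at evenly
    # spaced ranks, running from the sample minimum to the sample maximum.
    edges = []
    for k in range(n_q + 1):
        pos = k * (n - 1) / n_q
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        edges.append(xs[lo] * (1 - w) + xs[hi] * w)

    # Piecewise-constant density: each interval carries mass 1 / n_q, so its
    # density is 1 / (n_q * width) and it contributes log(n_q * width) / n_q.
    h = 0.0
    for a, b in zip(edges, edges[1:]):
        width = b - a
        if width > 0:  # zero-width intervals (ties) are skipped: an assumption
            h += math.log(n_q * width) / n_q
    return h


rng = random.Random(0)
uniform_sample = [rng.random() for _ in range(2000)]
print(quantile_spacing_entropy(uniform_sample))  # true entropy of U(0, 1) is 0
```

Because each interval here contains only a handful of points, this naive version tends to underestimate entropy and should not be expected to reach the accuracy reported in the abstract. It does show why only the number of quantiles needs to be specified.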


2016 ◽  
Vol 82 (0) ◽  
Author(s):  
Leonardo Morais Turchen ◽  
Vanessa Golin ◽  
Bruna Magda Favetti ◽  
Alessandra Regina Butnariu ◽  
Valmir Antônio Costa

The Neotropical brown stink bug, Euschistus heros (F.) (Hemiptera: Pentatomidae), is an insect pest of soybean crops in Mato Grosso State, Brazil. In this region, synthetic insecticides are frequently used for insect control. An alternative to the indiscriminate use of insecticides is biological control with parasitoids. Thus, the objective of this study was to survey the parasitoids that use E. heros adults as hosts. Random sampling was conducted during the harvests of 2009/10 and 2010/11 on two farms that produce soybean (conventional system) in Tangará da Serra, Mato Grosso State, Brazil. The total number of collected E. heros was 297 (Field 1) and 293 (Field 2) in 2009/10, and 295 (Field 1) and 376 (Field 2) in 2010/11. Of these, 1.50% (Field 1) and 13.99% (Field 2) were parasitized in 2009/10, and 8.47% (Field 1) and 7.45% (Field 2) in 2010/11. The parasitoid found in both fields was Hexacladia smithii Ashmead (Hymenoptera: Encyrtidae). This is the first record of parasitism in E. heros adults in the state of Mato Grosso, Brazil.


1995 ◽  
Vol 45 (1-2) ◽  
pp. 61-72 ◽  
Author(s):  
Mark Carpenter ◽  
Nabendu Pal

Assume independent random samples are drawn from two populations which are exponentially distributed with unknown location parameters and a common unknown scale parameter. The aim of this paper is to estimate the minimum and maximum of the unknown location parameters. Several estimators are proposed and their properties in terms of MSE and absolute bias are studied and compared.
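As a point of reference for this setup, the following sketch simulates the simplest plug-in estimators built from the two sample minima (the sample minimum is the maximum-likelihood estimator of the location parameter of a shifted exponential). These are a natural baseline only; the abstract does not say which estimators the paper proposes, and the function names and parameter values below are hypothetical.

```python
import random


def plug_in_min_max(sample1, sample2):
    """Plug-in estimates of min(mu1, mu2) and max(mu1, mu2) from sample minima."""
    m1, m2 = min(sample1), min(sample2)
    return min(m1, m2), max(m1, m2)


def simulate(mu1, mu2, scale, n, reps=2000, seed=0):
    """Monte Carlo absolute bias and MSE of the plug-in estimators."""
    rng = random.Random(seed)
    true_min, true_max = min(mu1, mu2), max(mu1, mu2)
    err_min, err_max = [], []
    for _ in range(reps):
        # Shifted exponential: location mu plus an exponential with mean `scale`.
        s1 = [mu1 + rng.expovariate(1 / scale) for _ in range(n)]
        s2 = [mu2 + rng.expovariate(1 / scale) for _ in range(n)]
        est_min, est_max = plug_in_min_max(s1, s2)
        err_min.append(est_min - true_min)
        err_max.append(est_max - true_max)

    def summarize(errors):
        bias = sum(errors) / len(errors)
        mse = sum(e * e for e in errors) / len(errors)
        return {"abs_bias": abs(bias), "mse": mse}

    return {"min": summarize(err_min), "max": summarize(err_max)}


print(simulate(mu1=0.0, mu2=1.0, scale=1.0, n=10))
```

Because every observation lies at or above its own location parameter, both plug-in estimates can only overshoot their targets. Bias-corrected alternatives are therefore worth comparing on exactly the two criteria named in the abstract.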


1979 ◽  
Vol 30 (1) ◽  
pp. 25 ◽  
Author(s):  
AJ Butler ◽  
FJ Brewster

Fourteen random samples of Pinna bicolor were collected over a period of 31 months from 6 m depth in Gulf St Vincent off Edithburgh, South Australia. The length-frequency distributions suggest that: P. bicolor larvae settle in spring but with variable success; growth of newly settled young is rapid over summer; by age 1 year their modal shell length is about 20 cm; by age 2 it is about 26 cm; they may survive substantially longer than 3 years so that a length-class of mode c. 35 cm is always present and is composed of several age-classes not necessarily equally represented. These suggestions are corroborated by limited data on adductor muscle scars, the development of epibiota on the shells, and the growth and survival of tagged animals over 9 months.


2013 ◽  
Vol 321-324 ◽  
pp. 1939-1942
Author(s):  
Lei Gu

The locality sensitive k-means clustering method has been presented recently. Although this approach can improve clustering accuracy, it often yields unstable clustering results because random samples are used as the initial centers. In this paper, an initialization method based on core clusters is used for locality sensitive k-means clustering. The core clusters are formed by constructing the σ-neighborhood graph, and their centers are taken as the initial centers of the locality sensitive k-means clustering. To investigate the effectiveness of our approach, several experiments are conducted on three datasets. Experimental results show that our proposed method improves clustering performance compared to the previous locality sensitive k-means clustering.
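A deterministic initialization of this kind can be sketched as below. The abstract does not define how core clusters are extracted from the σ-neighborhood graph, so this sketch assumes they are the k largest connected components of the graph that links points lying within distance σ of each other. The function name `core_cluster_centers` is hypothetical.

```python
import math
from collections import deque


def core_cluster_centers(points, sigma, k):
    """Return up to k initial centers from a sigma-neighborhood graph.

    Assumption (not from the paper): a core cluster is a connected component
    of the graph joining points at distance <= sigma; the k largest are used.
    """
    n = len(points)
    dim = len(points[0])

    # Sigma-neighborhood graph as adjacency lists (O(n^2) distance checks).
    neighbors = [
        [j for j in range(n) if j != i and math.dist(points[i], points[j]) <= sigma]
        for i in range(n)
    ]

    # Connected components by breadth-first search.
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            i = queue.popleft()
            component.append(i)
            for j in neighbors[i]:
                if not seen[j]:
                    seen[j] = True
                    queue.append(j)
        components.append(component)

    # Largest components first; each center is the mean of its members.
    components.sort(key=len, reverse=True)
    return [
        tuple(sum(points[i][d] for i in comp) / len(comp) for d in range(dim))
        for comp in components[:k]
    ]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
print(core_cluster_centers(points, sigma=1.5, k=2))
```

Unlike randomly drawn initial centers, the result depends only on the data and on σ, so repeated runs start from the same centers.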

