Refining random samples

Summary: The availability of factor VIII concentrates is frequently a limitation in the management of classical hemophilia. Such concentrates are prepared from fresh or fresh-frozen plasma. A significant volume of plasma in the United States becomes “indated”, i.e., in contact with red blood cells for 24 hours at 4°, and is therefore not used to prepare factor VIII concentrates. To evaluate this possible resource, partially purified factor VIII was prepared from random samples of fresh-frozen, indated and outdated plasma. The yield of factor VIII protein and procoagulant activity from indated plasma was about the same as that from fresh-frozen plasma. The yield from outdated plasma was substantially less. After further purification, factor VIII from the three sources gave a single subunit band when reduced and analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis. These results indicate that the approximately 287,000 liters of indated plasma processed annually by the American National Red Cross (ANRC) could be used to prepare factor VIII concentrates of good quality. This resource alone could quadruple the supply of factor VIII available for therapy.

Estimating epidemiologic dynamics from cross-sectional viral load distributions

Science ◽

10.1126/science.abh0635 ◽

2021 ◽

pp. eabh0635

Author(s):

James A. Hay ◽

Lee Kennedy-Shaffer ◽

Sanjat Kanjilal ◽

Niall J. Lennon ◽

Stacey B. Gabriel ◽

...

Keyword(s):

Population Distribution ◽

Quantitative Polymerase Chain Reaction ◽

Cross Sectional ◽

Outbreak Management ◽

Viral Loads ◽

Time Estimates ◽

Load Distributions ◽

Random Samples ◽

Combining Data ◽

Polymerase Chain

Estimating an epidemic’s trajectory is crucial for developing public health responses to infectious diseases, but case data used for such estimation are confounded by variable testing practices. We show that the population distribution of viral loads observed under random or symptom-based surveillance, in the form of cycle threshold (Ct) values obtained from reverse-transcription quantitative polymerase chain reaction testing, changes during an epidemic. Thus, Ct values from even limited numbers of random samples can provide improved estimates of an epidemic’s trajectory. Combining data from multiple such samples improves the precision and robustness of such estimation. We apply our methods to Ct values from surveillance conducted during the SARS-CoV-2 pandemic in a variety of settings and offer alternative approaches for real-time estimates of epidemic trajectories for outbreak management and response.

Non-random sampling and association tests on realized returns and risk proxies

Review of Accounting Studies ◽

10.1007/s11142-021-09581-0 ◽

2021 ◽

Author(s):

Frank Ecker ◽

Jennifer Francis ◽

Per Olsson ◽

Katherine Schipper

Keyword(s):

Random Sampling ◽

Reference Sample ◽

Positive Association ◽

Cost Of Equity ◽

Association Tests ◽

Random Samples ◽

Distribution Matching ◽

Matched Samples ◽

Data Requirements ◽

Selection Of

Abstract: This paper investigates how data requirements often encountered in archival accounting research can produce a data-restricted sample that is a non-random selection of observations from the reference sample to which the researcher wishes to generalize results. We illustrate the effects of non-random sampling on results of association tests in a setting with data on one variable of interest for all observations and frequently-missing data on another variable of interest. We develop and validate a resampling approach that uses only observations from the data-restricted sample to construct distribution-matched samples that approximate randomly-drawn samples from the reference sample. Our simulation tests provide evidence that distribution-matched samples yield generalizable results. We demonstrate the effects of non-random sampling in tests of the association between realized returns and five implied cost of equity metrics. In this setting, the reference sample has full information on realized returns, while on average only 16% of reference sample observations have data on cost of equity metrics. Consistent with prior research (e.g., Easton and Monahan The Accounting Review 80, 501–538, 2005), analysis using the unadjusted (non-random) cost of equity sample reveals weak or negative associations between realized returns and cost of equity metrics. In contrast, using distribution-matched samples, we find reliable evidence of the theoretically-predicted positive association. We also conceptually and empirically compare distribution-matching with multiple imputation and selection models, two other approaches to dealing with non-random samples.
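
The resampling idea in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' procedure, so everything here is an assumption made for illustration: the function name `distribution_matched_sample`, the use of equal-count bins on the fully observed variable, and drawing with replacement. The sketch resamples the data-restricted observations so that their distribution over the fully observed variable follows the reference sample's.

```python
import bisect
import random


def distribution_matched_sample(reference, restricted, size, n_bins=10, seed=0):
    """Resample `restricted` so its x-distribution follows `reference`.

    reference  -- x values for every observation in the reference sample
    restricted -- (x, y) pairs for observations where y is also available
    Hypothetical helper: a binning-based illustration, not the paper's method.
    """
    rng = random.Random(seed)
    ref_sorted = sorted(reference)
    # Interior cut points taken at equally spaced ranks of the reference x values.
    cuts = [ref_sorted[(i * len(ref_sorted)) // n_bins] for i in range(1, n_bins)]

    def bin_of(x):
        return bisect.bisect_right(cuts, x)

    # Count reference observations per bin.
    ref_counts = [0] * n_bins
    for x in reference:
        ref_counts[bin_of(x)] += 1

    # Group the restricted observations by the same bins.
    pools = [[] for _ in range(n_bins)]
    for obs in restricted:
        pools[bin_of(obs[0])].append(obs)

    # A bin can only be drawn from if the restricted sample covers it.
    weights = [count if pools[b] else 0 for b, count in enumerate(ref_counts)]

    # Pick bins in proportion to the reference counts, then draw one
    # restricted observation (with replacement) from each chosen bin.
    chosen_bins = rng.choices(range(n_bins), weights=weights, k=size)
    return [rng.choice(pools[b]) for b in chosen_bins]
```

Association tests would then be run on the returned pairs rather than on the raw restricted sample. Bins that the restricted sample does not cover at all are silently dropped in this sketch, which a real analysis would need to handle explicitly.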

Different Hatching Rates of Floodwater Mosquitoes Aedes sticticus, Aedes rossicus and Aedes cinereus from Different Flooded Environments

Insects ◽

10.3390/insects12040279 ◽

2021 ◽

Vol 12 (4) ◽

pp. 279

Author(s):

Anders Lindström ◽

Disa Eklöf ◽

Tobias Lilja

Keyword(s):

Tap Water ◽

Soil Samples ◽

Mosquito Larvae ◽

Retaining Structures ◽

Retaining Structure ◽

Large Numbers ◽

Random Samples ◽

Nuisance Species ◽

Flooding Events ◽

Hatching Rates

In the lower Dalälven region, floodwater mosquitoes cause recurring problems. The main nuisance species is Aedes (Ochlerotatus) sticticus, but large numbers of Aedes (Aedes) rossicus and Aedes (Aedes) cinereus also hatch during flooding events. To increase understanding of which environments in the area give rise to mosquito nuisance, soil samples were taken from 20 locations in four environmental categories: grazed meadows, mowed meadows, unkept open grassland areas and forest areas. At each location, 20 soil samples were taken: 10 from random locations and 10 from moisture-retaining structures, such as tussocks, shrubs, piles of leaves, logs, and roots. The soil samples were soaked with tap water in the lab, and mosquito larvae were collected and allowed to develop into adult mosquitoes for species identification. Fewer larvae hatched from mowed areas, and more larvae hatched from moisture-retaining structure samples than from random samples. The results showed that Aedes cinereus mostly hatched from grazed and unkept areas and hatched as much from random samples as from structures, whereas Aedes sticticus and Aedes rossicus hatched from open unkept and forest areas and hatched significantly more from structure samples. When the moisture-retaining structures in open unkept areas where Aedes sticticus hatched were identified, it was clear that the larvae hatched predominantly from willow shrubs that offered shade. The results suggest that Ae. sticticus and Ae. cinereus favor different flooded environments for oviposition.

Computing Accurate Probabilistic Estimates of One-D Entropy from Equiprobable Random Samples

Entropy ◽

10.3390/e23060740 ◽

2021 ◽

Vol 23 (6) ◽

pp. 740

Author(s):

Hoshin V. Gupta ◽

Mohammad Reza Ehsani ◽

Tirthankar Roy ◽

Maria A. Sans-Fuentes ◽

Uwe Ehret ◽

...

Keyword(s):

Sample Size ◽

Parameter Tuning ◽

Gaussian Mixture ◽

Optimal Number ◽

Small Sample ◽

Random Samples ◽

Data Points ◽

Probability Mass ◽

Entropy Estimate ◽

Log Normal

We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) and Kernel Density (KD) methods. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data generating probability density function (pdf) into equal-probability-mass intervals. And, whereas BC and KD each require optimal tuning of a hyper-parameter whose value varies with sample size and shape of the pdf, QS only requires specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantiles is a fixed fraction of the sample size (empirically determined to be ~0.25–0.35), and that this value is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC and KD since hyper-parameter tuning is not required. Further, unlike KD, there is no need to select an appropriate kernel-type, and so QS is applicable to pdfs of arbitrary shape, including those with discontinuous slope and/or magnitude. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is low even for samples of as few as 100 data points; in contrast, for KD the small sample bias can be as large as -10% and for BC as large as -50%. We speculate that estimating quantile locations, rather than bin-probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data generating pdf.
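
The core idea, estimating entropy from the widths of equal-probability-mass intervals, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes plain empirical quantiles with linear interpolation, a continuous sample without ties, and a hypothetical default fraction of 0.3 taken from the range quoted above. If each of the k intervals holds mass 1/k and has width w_i, the density inside it is roughly (1/k)/w_i, so the entropy is approximated by the mean of log(k * w_i).

```python
import math


def quantile(sorted_xs, p):
    """Empirical quantile at probability p, by linear interpolation."""
    pos = p * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] + frac * (sorted_xs[hi] - sorted_xs[lo])


def qs_entropy(sample, fraction=0.3):
    """Spacing-based entropy estimate in nats (hypothetical helper).

    `fraction` sets the number of quantile intervals relative to the
    sample size; 0.3 is an assumed default, not a value fixed by the paper.
    """
    xs = sorted(sample)
    k = max(2, int(fraction * len(xs)))  # number of equal-mass intervals
    # Interval edges at probabilities 0, 1/k, ..., 1.
    edges = [quantile(xs, i / k) for i in range(k + 1)]
    widths = [b - a for a, b in zip(edges, edges[1:])]
    if any(w <= 0 for w in widths):
        raise ValueError("zero-width interval: sample has ties or is too small")
    # Each interval carries mass 1/k, so H is approximated by mean(log(k * w)).
    return sum(math.log(k * w) for w in widths) / k
```

Because this version uses the sample minimum and maximum as the outer edges and a single pass of empirical quantiles, it should not be expected to reproduce the bias figures reported in the abstract. It does preserve one exact property of differential entropy: multiplying every data point by a constant a > 0 shifts the estimate by log(a).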

Natural parasitism of Hexacladia smithii Ashmead (Hymenoptera: Encyrtidae) on Euschistus heros (F.) (Hemiptera: Pentatomidae): new record from Mato Grosso State, Brazil

Arquivos do Instituto Biológico ◽

10.1590/1808-1657000852013 ◽

2016 ◽

Vol 82 (0) ◽

Cited By ~ 3

Author(s):

Leonardo Morais Turchen ◽

Vanessa Golin ◽

Bruna Magda Favetti ◽

Alessandra Regina Butnariu ◽

Valmir Antônio Costa

Keyword(s):

Biological Control ◽

New Record ◽

Insect Pest ◽

First Record ◽

Insect Control ◽

Conventional System ◽

Mato Grosso ◽

Random Samples ◽

Euschistus Heros ◽

Natural Parasitism

The Neotropical brown stink bug, Euschistus heros (F.) (Hemiptera: Pentatomidae), is an insect pest of soybean crops in Mato Grosso State, Brazil. In this region, synthetic insecticides are frequently used for insect control. An alternative to the indiscriminate use of insecticides is biological control with parasitoids. Thus, the objective of this study was to survey the parasitoids that use E. heros adults as hosts. Random sampling was conducted during the harvests of 2009/10 and 2010/11 on two farms that produce soybean (conventional system) in Tangará da Serra, Mato Grosso State, Brazil. The total number of collected E. heros was 297 (Field 1) and 293 (Field 2) in 2009/10, and 295 (Field 1) and 376 (Field 2) in 2010/11. Of these, 1.50% (Field 1) and 13.99% (Field 2) were parasitized in 2009/10, and 8.47% (Field 1) and 7.45% (Field 2) in 2010/11. The parasitoid found in both fields was Hexacladia smithii Ashmead (Hymenoptera: Encyrtidae). This is the first record of parasitism in E. heros adults in the state of Mato Grosso, Brazil.

Estimation of the Smaller and Larger of Two Exponential Location Parameters with a Common Unknown Scale Parameter

Calcutta Statistical Association Bulletin ◽

10.1177/0008068319950103 ◽

1995 ◽

Vol 45 (1-2) ◽

pp. 61-72 ◽

Cited By ~ 3

Author(s):

Mark Carpenter ◽

Nabendu Pal

Keyword(s):

Scale Parameter ◽

Random Samples ◽

Location Parameters ◽

Absolute Bias ◽

Two Populations

Assume independent random samples are drawn from two populations which are exponentially distributed with unknown location parameters and a common unknown scale parameter. The aim of this paper is to estimate the minimum and maximum of the unknown location parameters. Several estimators are proposed, and their properties in terms of MSE and absolute bias are studied and compared.
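
The abstract does not say which estimators are proposed, so the sketch below shows only a baseline under stated assumptions: the plug-in maximum-likelihood estimates (each sample minimum estimates its location parameter, and the pooled mean excess over the minima estimates the common scale), plus a small Monte Carlo check of their absolute bias and mean squared error. The function names `plug_in_estimates` and `simulate` are hypothetical.

```python
import random


def plug_in_estimates(sample1, sample2):
    """Plug-in estimates of the smaller and larger location and the scale."""
    m1, m2 = min(sample1), min(sample2)  # MLEs of the two locations
    n = len(sample1) + len(sample2)
    # Pooled MLE of the common scale: mean excess over each sample minimum.
    scale = (sum(x - m1 for x in sample1) + sum(x - m2 for x in sample2)) / n
    return min(m1, m2), max(m1, m2), scale


def simulate(mu1, mu2, sigma, n1, n2, reps=2000, seed=0):
    """Monte Carlo absolute bias and mean squared error of the plug-in estimates."""
    rng = random.Random(seed)
    true_small, true_large = min(mu1, mu2), max(mu1, mu2)
    err_small, err_large = [], []
    for _ in range(reps):
        # expovariate takes a rate, so the scale sigma enters as 1 / sigma.
        s1 = [mu1 + rng.expovariate(1.0 / sigma) for _ in range(n1)]
        s2 = [mu2 + rng.expovariate(1.0 / sigma) for _ in range(n2)]
        small, large, _scale = plug_in_estimates(s1, s2)
        err_small.append(small - true_small)
        err_large.append(large - true_large)

    def summarize(errors):
        bias = sum(errors) / len(errors)
        mse = sum(e * e for e in errors) / len(errors)
        return {"abs_bias": abs(bias), "mse": mse}

    return {"smaller": summarize(err_small), "larger": summarize(err_large)}
```

A call such as `simulate(0.0, 5.0, 1.0, 10, 10)` returns the two summaries. Because a sample minimum can never fall below its location parameter, the plug-in estimates are biased upward when the two locations are well separated, which is the kind of behaviour that motivates comparing alternative estimators.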

An approach to robust optimization of impact problems using random samples and meta-modelling

International Journal of Impact Engineering ◽

10.1016/j.ijimpeng.2009.07.002 ◽

2010 ◽

Vol 37 (6) ◽

pp. 723-734 ◽

Cited By ~ 17

Author(s):

David Lönn ◽

Ørjan Fyllingen ◽

Larsgunnar Nilsson

Keyword(s):

Robust Optimization ◽

Random Samples ◽

Impact Problems

Size Distributions and Growth of the Fan-shell Pinna bicolor Gmelin (Mollusca : Eulamellibranchia) in South Australia

Marine and Freshwater Research ◽

10.1071/mf9790025 ◽

1979 ◽

Vol 30 (1) ◽

pp. 25 ◽

Cited By ~ 27

Author(s):

AJ Butler ◽

FJ Brewster

Keyword(s):

South Australia ◽

Adductor Muscle ◽

Size Distributions ◽

Limited Data ◽

Length Frequency ◽

Age Classes ◽

Frequency Distributions ◽

Growth And Survival ◽

Random Samples ◽

Variable Success

Fourteen random samples of Pinna bicolor were collected over a period of 31 months from 6 m depth in Gulf St Vincent off Edithburgh, South Australia. The length-frequency distributions suggest that: P. bicolor larvae settle in spring but with variable success; growth of newly settled young is rapid over summer; by age 1 year their modal shell length is about 20 cm; by age 2 it is about 26 cm; they may survive substantially longer than 3 years so that a length-class of mode c. 35 cm is always present and is composed of several age-classes not necessarily equally represented. These suggestions are corroborated by limited data on adductor muscle scars, the development of epibiota on the shells, and the growth and survival of tagged animals over 9 months.

A Novel Locality Sensitive K-Means Clustering Algorithm based on Core Clusters

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.321-324.1939 ◽

2013 ◽

Vol 321-324 ◽

pp. 1939-1942

Author(s):

Lei Gu

Keyword(s):

Clustering Algorithm ◽

Experimental Results ◽

Clustering Method ◽

Neighborhood Graph ◽

The Core ◽

Random Samples

The locality sensitive k-means clustering method was presented recently. Although this approach can improve clustering accuracy, it often produces unstable clustering results because random samples are used as the initial centers. In this paper, an initialization method based on core clusters is used for locality sensitive k-means clustering. The core clusters are formed by constructing the σ-neighborhood graph, and their centers are taken as the initial centers of the locality sensitive k-means clustering. To investigate the effectiveness of our approach, several experiments are conducted on three datasets. Experimental results show that the proposed method improves clustering performance compared with the previous locality sensitive k-means clustering.
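
A deterministic initialization of this kind might look like the sketch below. The abstract does not define how core clusters are extracted from the σ-neighborhood graph, so this version makes two assumptions: points are linked when their Euclidean distance is at most σ, and the core clusters are the k largest connected components of that graph. The function name `core_cluster_centers` is hypothetical, and the locality sensitive k-means step itself is not shown.

```python
import math


def core_cluster_centers(points, sigma, k):
    """Initial centers from the k largest components of the sigma-neighborhood graph.

    points -- list of equal-length coordinate tuples
    Assumption: 'core cluster' is read here as a connected component.
    """
    n = len(points)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over points within distance sigma of each other.
        seen[start] = True
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and math.dist(points[i], points[j]) <= sigma:
                    seen[j] = True
                    stack.append(j)
        components.append(component)

    # Keep the k largest components; ties keep their discovery order.
    components.sort(key=len, reverse=True)
    dim = len(points[0])
    centers = []
    for component in components[:k]:
        center = tuple(
            sum(points[i][d] for i in component) / len(component)
            for d in range(dim)
        )
        centers.append(center)
    return centers
```

Unlike random initial centers, the result depends only on the data, σ and k, so repeated runs start from the same centers. The pairwise distance scan is quadratic in the number of points, which is acceptable for a sketch but would need a spatial index for large datasets.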
