On dividing reference data into subgroups to produce separate reference ranges

1990 ◽  
Vol 36 (2) ◽  
pp. 265-270 ◽  
Author(s):  
E K Harris ◽  
J C Boyd

Abstract. We consider statistical criteria for partitioning a reference database to obtain separate reference ranges for different subpopulations. Using general formulas relating population variances, sample sizes, and the normal deviate test for the significance of the difference between two subgroup means, we show that partitioning into separate ranges produces little reduction in between-person variability, even when the differences between means are highly significant statistically. However, when there is a clear physiological basis for distinguishing between certain subgroups, simulation studies show that partitioning may be necessary to obtain reference limits that cut off the desired proportions of low and high values in each subgroup. Guidelines based on these results are provided to help decide whether separate ranges should be obtained for a given analyte.
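As a rough illustration of the normal deviate test mentioned in the abstract, the sketch below computes z = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2) for two subgroups. The function name `normal_deviate` and the example numbers are hypothetical; they are not taken from Harris and Boyd, and the paper's partitioning guidelines are not reproduced here.

```python
import math
from statistics import NormalDist


def normal_deviate(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two-sided p) for the difference between two subgroup means."""
    # Standard error of the difference between two independent sample means
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / se
    # Two-sided p-value from the standard normal distribution
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical subgroup summaries (illustrative numbers only, not from the paper)
z, p = normal_deviate(140.0, 2.0, 400, 139.5, 2.0, 400)
print(f"z = {z:.2f}, two-sided p = {p:.4g}")
```

With large subgroups, a difference of only a quarter of a standard deviation already gives a large z, which is the situation the abstract describes: statistical significance alone says little about whether separate ranges are worthwhile.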

1992 ◽  
Vol 38 (5) ◽  
pp. 648-650 ◽  
Author(s):  
J A Lott ◽  
L C Mitchell ◽  
M L Moeschberger ◽  
D E Sutherland

Abstract. We measured Na, K, Cl, glucose, hemoglobin, erythrocytes, and hematocrit in the serum or blood from approximately 800 male and 200 female second-year medical students in an effort to define the size of an acceptable reference population. Using the data from the men and Monte Carlo simulations of 10-400 samples, each carried out 5000 times, we found that for most of the above tests, approximately 200 people are required for stable lower (2.5%) and upper (97.5%) reference limits to be obtained. This agrees with the 198 subjects required by strictly statistical criteria to define the same limits with a 99% confidence level.
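The resampling idea in this abstract can be sketched as follows: draw repeated samples of a given size from a pool of values, estimate the 2.5% and 97.5% limits in each, and see how much the estimates vary. This is a minimal sketch under stated assumptions, not the authors' procedure: the pool is synthetic Gaussian data (mean 140, SD 2, chosen arbitrarily), the percentile rule is simple linear interpolation, and 500 replicates are used instead of the 5000 reported, to keep it fast. The helper names are hypothetical.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def limit_spread(population, n, replicates, rng):
    """SD of the estimated 2.5% and 97.5% limits over repeated samples of size n."""
    lows, highs = [], []
    for _ in range(replicates):
        sample = sorted(rng.sample(population, n))  # sampling without replacement
        lows.append(percentile(sample, 0.025))
        highs.append(percentile(sample, 0.975))
    return statistics.stdev(lows), statistics.stdev(highs)


rng = random.Random(0)
# Synthetic stand-in for the measured values (assumption, not the study data)
population = [rng.gauss(140.0, 2.0) for _ in range(800)]
for n in (10, 50, 100, 200, 400):
    sd_low, sd_high = limit_spread(population, n, 500, rng)
    print(n, round(sd_low, 3), round(sd_high, 3))
```

The printed spreads shrink as the sample size grows; deciding at which size the limits count as "stable" requires a criterion that this sketch does not define.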


Author(s):  
Guanghao Qi ◽  
Nilanjan Chatterjee

Abstract. Background: Previous studies have often evaluated methods for Mendelian randomization (MR) analysis using simulations that do not adequately reflect the data-generating mechanisms in genome-wide association studies (GWAS), and the performance of MR methods often differs between simulations and real data sets. Methods: We use a simulation framework that generates data on full GWAS for two traits under a realistic model for effect-size distribution coherent with the heritability, co-heritability and polygenicity typically observed for complex traits. We further use recent data generated from GWAS of 38 biomarkers in the UK Biobank and perform down-sampling to investigate trends in estimates of causal effects of these biomarkers on the risk of type 2 diabetes (T2D). Results: Simulation studies show that weighted mode and MRMix are the only two methods that maintain the correct type I error rate in a diverse set of scenarios. Between the two methods, MRMix tends to be more powerful for larger GWAS, whereas the opposite is true for smaller sample sizes. Among the other methods, random-effect IVW (inverse-variance weighted method), MR-Robust and MR-RAPS (robust adjusted profile score) tend to perform best in maintaining a low mean-squared error when the InSIDE assumption is satisfied, but can produce large bias when InSIDE is violated. In real-data analysis, some biomarkers showed major heterogeneity in estimates of their causal effects on the risk of T2D across the different methods, and estimates from many methods trended in one direction with increasing sample size, with patterns similar to those observed in simulation studies. Conclusion: The relative performance of different MR methods depends heavily on the sample sizes of the underlying GWAS, the proportion of valid instruments and the validity of the InSIDE assumption. Down-sampling analysis can be used in large GWAS for the possible detection of bias in the MR methods.
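For readers unfamiliar with the IVW method named in the abstract, the sketch below shows the standard fixed-effect inverse-variance weighted estimate computed from per-variant summary statistics, with weights 1/se_y^2. It is an illustration only: the function name and the input numbers are made up, and the random-effect variant evaluated in the paper additionally inflates the standard error for heterogeneity, which is not shown.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW causal estimate and its standard error.

    beta_x: variant-exposure effects, beta_y: variant-outcome effects,
    se_y: standard errors of the variant-outcome effects.
    """
    weights = [1.0 / s**2 for s in se_y]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_x))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for four variants (made-up numbers)
beta_x = [0.10, 0.08, 0.12, 0.05]
beta_y = [0.021, 0.015, 0.026, 0.009]
se_y = [0.005, 0.006, 0.005, 0.007]
est, se = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate = {est:.3f} (SE {se:.3f})")
```

The estimate is a weighted regression of outcome effects on exposure effects through the origin, so it is only unbiased when pleiotropic effects average out, which is where the InSIDE assumption discussed in the abstract comes in.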


2002 ◽  
Vol 53 (6) ◽  
pp. 643 ◽  
Author(s):  
M. J. Robertson ◽  
J. F. Holland ◽  
S. Cawley ◽  
T. D. Potter ◽  
W. Burton ◽  
...  

Canola tolerant to the triazine group of herbicides is grown widely in Australian broad-acre cropping systems. Triazine-tolerant (TT) cultivars are known to have a yield and oil content penalty compared with non-TT cultivars. This study was designed to elucidate the crop physiological basis for the yield differences between the two types. Two commercial cultivars, near-isogenic for the TT trait, were compared in a detailed growth analysis in the field, and 22 crops were compared for phenology and crop attributes at maturity. In the growth analysis study, the TT trait was found to lower radiation use efficiency, which carried through to less biomass at maturity. There were minimal effects on leaf area development and harvest index, and no effect on canopy radiation extinction. Across the 22 crops, where yield varied from 240 to 3400 kg/ha in the non-TT cultivar, yield was on average 26% less in the TT cultivar because less biomass was produced, as there was no significant effect on harvest index. The difference in oil content (2-5%) was greater in low oil content environments. Flowering was delayed by 2-10 days, with the greater delays occurring in later-flowering environments. Quantification of the physiological attributes of TT canola allows the assessment of the productivity of different cultivar types across environments.


2020 ◽  
Vol 12 (4) ◽  
pp. 3229-3246
Author(s):  
Magí Franquesa ◽  
Melanie K. Vanderhoof ◽  
Dimitris Stavrakoudis ◽  
Ioannis Z. Gitas ◽  
Ekhi Roteta ◽  
...  

Abstract. Over the past 2 decades, several global burned area products have been produced and released to the public. However, the accuracy assessment of such products largely depends on the availability of reliable reference data that currently do not exist on a global scale or whose production requires a high level of dedication of project resources. This paper addresses the lack of reference data for the validation of burned area products. We provide the Burned Area Reference Database (BARD), the first publicly available database created by compiling existing reference BA (burned area) datasets from different international projects. BARD contains a total of 2661 reference files derived from Landsat and Sentinel-2 imagery. All of these files have been checked for internal quality and are freely provided by the authors. To ensure database consistency, all files were transformed to a common format and were properly documented by following metadata standards. The goal of generating this database was to give BA algorithm developers and product testers reference information that would help them to develop or validate new BA products. BARD is freely available at https://doi.org/10.21950/BBQQU7 (Franquesa et al., 2020).


2018 ◽  
Vol 7 (6) ◽  
pp. 68
Author(s):  
Karl Schweizer ◽  
Siegbert Reiß ◽  
Stefan Troche

An investigation of the suitability of threshold-based and threshold-free approaches for structural investigations of binary data is reported. Both approaches implicitly establish a relationship between binary data following the binomial distribution on one hand and continuous random variables assuming a normal distribution on the other hand. In two simulation studies we investigated: whether the fit results confirm the establishment of such a relationship, whether the differences between correct and incorrect models are retained and to what degree the sample size influences the results. Both approaches proved to establish the relationship. Using the threshold-free approach it was achieved by customary ML estimation whereas robust ML estimation was necessary in the threshold-based approach. Discrimination between correct and incorrect models was observed for both approaches. Larger CFI differences were found for the threshold-free approach than for the threshold-based approach. Dependency on sample size characterized the threshold-based approach but not the threshold-free approach. The threshold-based approach tended to perform better in large sample sizes, while the threshold-free approach performed better in smaller sample sizes.


2021 ◽  
Author(s):  
Stephan van der Westhuizen ◽  
Gerard Heuvelink ◽  
David Hofmeyr

Digital soil mapping (DSM) may be defined as the use of a statistical model to quantify the relationship between a certain observed soil property at various geographic locations and a collection of environmental covariates, and then using this relationship to predict the soil property at locations where the property was not measured. It is also important to quantify the uncertainty of the predictions in these soil maps. An important source of uncertainty in DSM is measurement error, which is defined as the difference between the measured and the true value of a soil property.

The use of machine learning (ML) models such as random forests (RF) has become a popular trend in DSM. This is because ML models tend to be capable of accommodating highly non-linear relationships between the soil property and covariates. However, it is not clear how to incorporate measurement error into ML models. In this presentation we will discuss how to incorporate measurement error into some popular ML models, starting with incorporating weights into the objective function of ML models that implicitly assume a Gaussian error. We will discuss the effect that these modifications have on prediction accuracy, with reference to simulation studies.


2019 ◽  
Vol 9 (4) ◽  
pp. 813-850 ◽  
Author(s):  
Jay Mardia ◽  
Jiantao Jiao ◽  
Ervin Tánczos ◽  
Robert D Nowak ◽  
Tsachy Weissman

Abstract. We study concentration inequalities for the Kullback–Leibler (KL) divergence between the empirical distribution and the true distribution. Applying a recursion technique, we improve over the method of types bound uniformly in all regimes of sample size $n$ and alphabet size $k$, and the improvement becomes more significant when $k$ is large. We discuss the applications of our results in obtaining tighter concentration inequalities for $L_1$ deviations of the empirical distribution from the true distribution, and the difference between concentration around the expectation and concentration around zero. We also obtain asymptotically tight bounds on the variance of the KL divergence between the empirical and true distribution, and demonstrate their quantitatively different behaviours between small and large sample sizes compared to the alphabet size.
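To make the quantities concrete, the sketch below computes the KL divergence between an empirical distribution and a known true distribution, together with the classical method-of-types tail bound, usually stated as P(D >= eps) <= C(n+k-1, k-1) * exp(-n * eps). Only this baseline bound is shown; the improved bounds from the paper are not reproduced. The function names, the uniform true distribution and the parameter values are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def empirical_kl(sample, p):
    """KL divergence D(p_hat || p) in nats, where p_hat is the empirical
    distribution of `sample` over symbols 0..k-1 and p is the true pmf."""
    n = len(sample)
    counts = Counter(sample)
    # Symbols with zero count contribute 0 to the sum, so they are skipped
    return sum((c / n) * math.log((c / n) / p[x]) for x, c in counts.items())


def method_of_types_bound(n, k, eps):
    """Classical bound on P(D(p_hat || p) >= eps), capped at 1."""
    return min(1.0, math.comb(n + k - 1, k - 1) * math.exp(-n * eps))


rng = random.Random(0)
k, n = 5, 200                      # alphabet size and sample size (arbitrary)
p = [1.0 / k] * k                  # assumed true distribution: uniform
sample = rng.choices(range(k), weights=p, k=n)
print("empirical KL:", empirical_kl(sample, p))
print("method-of-types bound at eps=0.1:", method_of_types_bound(n, k, 0.1))
```

The polynomial factor C(n+k-1, k-1) grows quickly with k, so the baseline bound becomes loose for large alphabets, which is the regime where the abstract reports the largest improvement.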


Stats ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 111-120 ◽  
Author(s):  
Dewi Rahardja

We construct point and interval estimators using a Bayesian approach for the difference of two population proportion parameters, based on two independent samples of binomial data subject to one type of misclassification. Specifically, we derive an easy-to-implement closed-form algorithm for drawing from the posterior distributions. For illustration, we apply our algorithm to a real data example. Finally, we conduct simulation studies to demonstrate the efficiency of our algorithm for Bayesian inference.
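As a simplified illustration of Bayesian inference for a difference of two proportions, the sketch below draws from independent Beta posteriors and summarizes the difference. It deliberately ignores misclassification and is therefore not the algorithm derived in the paper; the uniform Beta(1, 1) priors, the function name and the counts are all assumptions.

```python
import random


def posterior_difference(x1, n1, x2, n2, draws=20000, seed=0):
    """Posterior mean and equal-tailed 95% interval for p1 - p2, using
    independent Beta(1, 1) priors and error-free binomial counts."""
    rng = random.Random(seed)
    diffs = sorted(
        rng.betavariate(1 + x1, 1 + n1 - x1) - rng.betavariate(1 + x2, 1 + n2 - x2)
        for _ in range(draws)
    )
    mean = sum(diffs) / draws
    lower = diffs[int(0.025 * draws)]
    upper = diffs[int(0.975 * draws) - 1]
    return mean, lower, upper


# Hypothetical counts: 45/100 successes in sample 1, 30/100 in sample 2
mean, lower, upper = posterior_difference(45, 100, 30, 100)
print(f"posterior mean {mean:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

Handling misclassified binomial data, as in the paper, would require extra parameters for the error rate and a different posterior, which this sketch does not attempt.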


2019 ◽  

Current Therapy in Medicine of Australian Mammals provides an update on Australian mammal medicine. Although much of the companion volume, Medicine of Australian Mammals, is still relevant and current, there have been significant advances in Australian mammal medicine and surgery since its publication in 2008. The two texts together remain the most comprehensive source of information available in this field. This volume is divided into two sections. The first includes comprehensive chapters on general topics and topics relevant to multiple taxa. Several new topics are presented including: wildlife health in Australia and the important role veterinarians play in Australia’s biosecurity systems; medical aspects of native mammal reintroductions and translocations; disease risk analysis; wildlife rehabilitation practices in Australia with an emphasis on welfare of animals undergoing rehabilitation; management of overabundant populations; immunology; and stress physiology. The second section provides updates on current knowledge relevant to specific taxa. Several appendices provide useful reference data and information on clinical reference ranges, recommended venipuncture sites, chemical restraint agent doses and regimens, a drug formulary and dental charts. Written by Australian experts, Current Therapy in Medicine of Australian Mammals is clinically oriented, with emphasis on practical content with easy-to-use reference material. It is a must-have for veterinarians, students, biologists, zoologists and wildlife carers and other wildlife professionals. This volume also complements, updates and utilises the resources of other books such as Radiology of Australian Mammals (Vogelnest and Allan 2015), Pathology of Australian Native Wildlife (Ladds 2009), Haematology of Australian Mammals (Clark 2004) and Australian Mammals: Biology and Captive Management (Jackson 2003), all CSIRO Publishing publications.


2006 ◽  
Vol 25 (3) ◽  
pp. 393-404 ◽  
Author(s):  
Matthias Bauer ◽  
Jörg D. Katzenberger ◽  
Anne C. Hamm ◽  
Melanie Bonaus ◽  
Ingo Zinke ◽  
...  

The reallocation of metabolic resources is important for survival during periods of limited nutrient intake. This has an influence on diverse physiological processes, including reproduction, repair, and aging. One important aspect of resource allocation is the difference between males and females in response to nutrient stress. We identified several groups of genes that are regulated in a sex-biased manner under complete or protein starvation. These range from expected differences in genes involved in reproductive physiology to those involved in amino acid utilization, sensory perception, immune response, and growth control. A striking difference was observed in purine and the tightly interconnected folate metabolism upon protein starvation. From these results, we conclude that the purine and folate metabolic pathway is a major point of transcriptional regulation during resource allocation and may have relevance for understanding the physiological basis for the observed tradeoff between reproduction and longevity.

