substantial bias

2022 ◽  
Author(s):  
Noah Bernays ◽  
Daniel Jaffe ◽  
Irina Petropavlovskikh ◽  
Peter Effertz

Abstract. Long et al. (2021) conducted a detailed study of possible interferents in measurements of surface O3 by UV spectroscopy, which measures the UV transmission in ambient and O3-scrubbed air. While we appreciate the careful work done in this analysis, there were several omissions and, in one case, the type of scrubber used was misidentified as manganese dioxide (MnO2), when in fact it was manganese chloride (MnCl2). This misidentification led to the erroneous conclusion that all UV-based O3 instruments employing solid-phase catalytic scrubbers exhibit significant positive artifacts, whereas previous research found this not to be the case when employing MnO2 scrubber types. While the Long study, and our results, confirm the substantial bias in instruments employing an MnCl2 scrubber, a replication of the earlier work with an MnO2 scrubber type and no humidity correction is needed.


2021 ◽  
Author(s):  
Nathan Tardiff ◽  
Lalitta Suriya-Arunroj ◽  
Yale E. Cohen ◽  
Joshua I. Gold

Abstract. The varied effects of expectations on auditory perception are not well understood. For example, both top-down rules and bottom-up stimulus regularities generate expectations that can bias subsequent perceptual judgments. However, it is unknown whether these different sources of bias use the same or different computational and physiological mechanisms. We examined how rule-based and stimulus-based expectations influenced human subjects’ behavior and pupil-linked arousal, a marker of certain forms of expectation-based processing, during an auditory frequency-discrimination task. Rule-based cues biased choice and response times (RTs) toward the more-probable stimulus. In contrast, stimulus-based cues had a complex combination of effects, including choice and RT biases toward and away from the frequency of recently heard stimuli. These different behavioral patterns also had distinct computational signatures, including different modulations of key components of a novel form of a drift-diffusion model, and distinct physiological signatures, including substantial bias-dependent modulations of pupil size in response to rule-based but not stimulus-based cues. These results imply that different sources of expectations can modulate auditory perception via distinct mechanisms: one that uses arousal-linked, rule-based information and another that uses arousal-independent, stimulus-based information to bias the speed and accuracy of auditory perceptual decisions.


2021 ◽  
Author(s):  
Bo Dong ◽  
Xiaoqian Lin ◽  
Xiaohuan Jing ◽  
Tongyuan Hu ◽  
Jianwei Zhou ◽  
...  

Abstract. Background: The microbiota hosted in the pig gastrointestinal tract are important to the health of this biomedical model. However, the individual species and functional repertoires that make up the pig gut microbiome remain largely undefined. Results: Here we comprehensively investigated the genomes and functions of the piglet gut microbiome using culture-based and metagenomics approaches. A collection of 266 cultured genomes and 482 metagenome-assembled genomes (MAGs), clustered into 428 species across 10 phyla, was established. Among these clustered species, 333 genomes represent potential new species. The few matches between cultured genomes and MAGs revealed a substantial bias in the reference genomes acquired by the two strategies. Glycoside hydrolases were the dominant category of carbohydrate-active enzymes. A total of 445 secondary metabolite biosynthetic genes were predicted from 292 genomes, with bacteriocins being the most common. Pan-genome analysis of Limosilactobacillus reuteri revealed that the biosynthesis of reuterin was strain-specific, and its production was determined experimentally. Conclusions: A total of 266 isolated bacterial genomes and 482 MAGs were obtained, and their functional repertoires were investigated. This study provides a comprehensive view of the microbiome composition and functional landscape of the gut of weanling piglets, and a valuable bacterial resource for further experimentation.


2021 ◽  
Vol 31 (6) ◽  
Author(s):  
Alix Marie d’Avigneau ◽  
Sumeetpal S. Singh ◽  
Lawrence M. Murray

Abstract. Developing efficient MCMC algorithms is indispensable in Bayesian inference. In parallel tempering, multiple interacting MCMC chains run to more efficiently explore the state space and improve performance. The multiple chains advance independently through local moves, and the performance enhancement steps are exchange moves, where the chains pause to exchange their current sample amongst each other. To accelerate the independent local moves, they may be performed simultaneously on multiple processors. Another problem is then encountered: depending on the MCMC implementation and inference problem, local moves can take a varying and random amount of time to complete. There may also be infrastructure-induced variations, such as competing jobs on the same processors, which arise in cloud computing. Before exchanges can occur, all chains must complete the local moves they are engaged in to avoid introducing a potentially substantial bias (Proposition 1). To solve this issue of randomly varying local move completion times in multi-processor parallel tempering, we adopt the Anytime Monte Carlo framework of Murray, L. M., Singh, S., Jacob, P. E., and Lee, A. (Anytime Monte Carlo, arXiv preprint arXiv:1612.03319, 2016): we impose real-time deadlines on the parallel local moves and perform exchanges at these deadlines without any processor idling. We show our methodology for exchanges at real-time deadlines does not introduce a bias and leads to significant performance enhancements over the naïve approach of idling until every processor’s local moves complete. The methodology is then applied in an ABC setting, where an Anytime ABC parallel tempering algorithm is derived for the difficult task of estimating the parameters of a Lotka–Volterra predator-prey model, and similar efficiency enhancements are observed.
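
The exchange moves mentioned in this abstract can be illustrated with the textbook parallel tempering swap rule. The sketch below is not the authors' Anytime algorithm. It assumes each chain k targets a density proportional to exp(-beta_k * E(x)), and the names `swap_acceptance`, `beta_i`, and `energy_i` are hypothetical.

```python
import math


def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Probability of accepting a swap of the current states of chains i and j.

    Assumption: chain k targets a density proportional to exp(-beta_k * E(x)),
    so the Metropolis ratio for the swap is
    exp((beta_i - beta_j) * (energy_i - energy_j)).
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    if log_ratio >= 0:
        # Ratio is at least 1, so the swap is always accepted.
        return 1.0
    return math.exp(log_ratio)


# The colder chain (larger beta) holds the higher-energy state: always swap.
print(swap_acceptance(1.0, 0.5, 2.0, 1.0))
# The colder chain already holds the lower-energy state: swap only sometimes.
print(swap_acceptance(1.0, 0.5, 1.0, 2.0))
```

The rule is only valid when both chains have finished their local moves, which is the synchronization constraint the abstract describes.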


2021 ◽  
Author(s):  
Bo Dong ◽  
Xiaoqian Lin ◽  
Xiaohuan Jing ◽  
Tongyuan Hu ◽  
Jianwei Zhou ◽  
...  

The microbiota hosted in the pig gastrointestinal tract are important for livestock productivity. However, the individual species and functional repertoires that make up the pig gut microbiome remain largely undefined. Here we comprehensively investigated the genomes and functions of the piglet gut microbiome using culture-based and metagenomics approaches. A collection of 266 cultured genomes and 482 metagenome-assembled genomes (MAGs), clustered into 428 species across 10 phyla, was established. Among these clustered species, 333 genomes represent potential new species. The few matches between cultured genomes and MAGs revealed a substantial bias in the reference genomes acquired by the two strategies. Glycoside hydrolases were the dominant category of carbohydrate-active enzymes. A total of 445 secondary metabolite biosynthetic genes were predicted from 292 genomes, with bacteriocins being the most common. Pan-genome analysis of Limosilactobacillus reuteri revealed that the biosynthesis of reuterin was strain-specific, and its production was determined experimentally. These genomic resources will enable a comprehensive characterization of the microbiome composition and function of the pig gut.


2021 ◽  
Author(s):  
Philip C Haycock ◽  
Maria Carolina Borges ◽  
Kimberly Burrows ◽  
Rozenn N. Lemaitre ◽  
...  

Background: Mendelian randomization studies are susceptible to metadata errors (e.g. incorrect specification of the effect allele column) and other analytical issues that can introduce substantial bias into analyses. We developed a quality control pipeline for the Fatty Acids in Cancer Mendelian Randomization Collaboration (FAMRC) that can be used to identify and correct for such errors. Methods: We invited cancer GWAS to share summary association statistics with the FAMRC and subjected the collated data to a comprehensive QC pipeline. We identified metadata errors through comparison of study-specific statistics to external reference datasets (the NHGRI-EBI GWAS catalog and 1000 Genomes super populations) and other analytical issues through comparison of reported to expected genetic effect sizes. Comparisons were based on three sets of genetic variants: 1) GWAS hits for fatty acids, 2) GWAS hits for cancer and 3) a 1000 Genomes reference set. Results: We collated summary data from six fatty acid and 49 cancer GWAS. Metadata errors and analytical issues with the potential to introduce substantial bias were identified in seven studies (13%). After resolving analytical issues and excluding unreliable data, we created a dataset of 219,842 genetic associations with 87 cancer types. Conclusion: In this large MR collaboration, 13% of included studies were affected by a substantial metadata error or other analytical issue. By increasing the integrity of collated summary data prior to their analysis, our protocol can be used to increase the reliability of post-GWAS analyses. Our pipeline is available to other researchers via the CheckSumStats package (https://github.com/MRCIEU/CheckSumStats).
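
One way to picture a check for a mis-specified effect allele column is to compare reported effect allele frequencies with a reference panel. The sketch below is a hypothetical heuristic and does not reproduce the CheckSumStats package or its API. The function name, the `tolerance` parameter, and the example frequencies are all assumptions made for illustration.

```python
def fraction_allele_conflicts(reported_eaf, reference_eaf, tolerance=0.1):
    """Fraction of variants whose reported effect allele frequency (EAF)
    looks like the frequency of the *other* allele in the reference panel.

    Hypothetical heuristic: a variant counts as a conflict when the reported
    EAF is closer to (1 - reference EAF) than to the reference EAF by more
    than `tolerance`.
    """
    pairs = list(zip(reported_eaf, reference_eaf))
    if not pairs:
        raise ValueError("at least one variant is required")
    conflicts = 0
    for reported, reference in pairs:
        distance_same = abs(reported - reference)
        distance_flipped = abs(reported - (1.0 - reference))
        if distance_flipped + tolerance < distance_same:
            conflicts += 1
    return conflicts / len(pairs)


# Made-up frequencies: every reported value mirrors the reference,
# which would suggest the effect allele column is mis-specified.
print(fraction_allele_conflicts([0.8, 0.7, 0.9], [0.2, 0.3, 0.1]))
```

A fraction near 1 across many variants points to a column-level error rather than a handful of ambiguous variants, which is why the comparison is made over a set of variants rather than one at a time.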


2021 ◽  
Author(s):  
Fausto Andres Bustos Carrillo ◽  
Brenda Lopez Mercado ◽  
Jairo Carey Monterrey ◽  
Damaris Collado ◽  
Saira Saborio ◽  
...  

Explosive epidemics of chikungunya, Zika, and COVID-19 have recently occurred worldwide, all of which featured large proportions of subclinical infections. Spatial studies of infectious disease epidemics typically use symptomatic infections (cases) to estimate incidence rates (cases/total population), often misinterpreting them as infection risks (infections/total population) or disease risks (cases/infected population). We examined these three measures in a pediatric cohort (N≈3,000) over two chikungunya epidemics and one Zika epidemic and in a household cohort (N=1,793) over one COVID-19 epidemic in Nicaragua. Across different analyses and all epidemics, case incidence rates considerably underestimated both risk-based measures. Spatial infection risk differed from spatial disease risk, and typical case-only approaches precluded a full understanding of the spatial seroprevalence patterns. For epidemics of pathogens that cause many subclinical infections, relying on case-only datasets and misinterpreting incidence rates, as is common, results in substantial bias, a general finding applicable to many pathogens of high human concern.
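
The three measures defined in this abstract differ only in their numerators and denominators, which a short calculation makes concrete. The sketch below uses made-up counts, not the study's data, and the function name `epidemic_measures` is hypothetical.

```python
def epidemic_measures(cases, infections, population):
    """Return the three measures as defined in the abstract.

    case_incidence = cases / total population
    infection_risk = infections / total population
    disease_risk   = cases / infected population
    """
    if not 0 <= cases <= infections <= population or infections == 0:
        raise ValueError("need 0 <= cases <= infections <= population, infections > 0")
    return {
        "case_incidence": cases / population,
        "infection_risk": infections / population,
        "disease_risk": cases / infections,
    }


# Hypothetical cohort: 3,000 people, 1,200 infected, 300 symptomatic cases.
measures = epidemic_measures(cases=300, infections=1200, population=3000)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Whenever some infections are subclinical and not everyone is infected, the case incidence is smaller than both risk-based measures, which is the direction of bias the abstract reports.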


2021 ◽  
Author(s):  
Robert C Edgar

Phylogenetic tree confidence is often estimated from a multiple sequence alignment (MSA) using the Felsenstein bootstrap heuristic. However, this does not account for systematic errors in the MSA, which may cause substantial bias in the inferred phylogeny. Here, I describe the MSA ensemble bootstrap, a new procedure that generates a set of replicate MSAs by varying parameters such as gap penalties and substitution scores. Such an ensemble is called diagnostic if the typical distance between MSAs is comparable to the error rate. Confidence in a prediction derived from an MSA, e.g. a monophyletic clade, is expressed as the fraction of the ensemble where the prediction is reproduced. This approach is implemented in MUSCLE by modifying the Probcons algorithm, which is based on a hidden Markov model (HMM). An ensemble is generated by perturbing HMM parameters and permuting the guide tree. Ensembles generated by this method are shown to be diagnostic on the Balibase benchmark. To enable scaling to large datasets, divide-and-conquer heuristics are introduced. A new benchmark (Balifam) is described with 36 sets of 10000+ proteins. On Balifam, ensembles generated by MUSCLE are shown to align an average of 59% of columns correctly, 13% better than Clustal-omega (52% correct) and 26% better than MAFFT (47% correct). The ensemble bootstrap is applied to a previously published tree of RNA viruses, showing that the high reported Felsenstein bootstrap confidence of Ribovirus phylum branching order is an artifact of systematic MSA errors.
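
The confidence measure described in this abstract, the fraction of the ensemble in which a prediction is reproduced, can be sketched in a few lines. This is an illustration only, not MUSCLE's implementation. It assumes each replicate tree has already been reduced to a set of clades, and the taxon names and the function `ensemble_confidence` are hypothetical.

```python
def ensemble_confidence(clade, replicate_clades):
    """Fraction of replicates in which `clade` appears.

    `replicate_clades` holds one collection of clades per replicate MSA,
    where each clade is a frozenset of taxon names (an assumed encoding).
    """
    if not replicate_clades:
        raise ValueError("the ensemble must contain at least one replicate")
    target = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if target in clades)
    return hits / len(replicate_clades)


# Four hypothetical replicates; the clade {A, B} is recovered in three of them.
ensemble = [
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"B", "D"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
]
print(ensemble_confidence({"A", "B"}, ensemble))
```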


2021 ◽  
pp. 1-16
Author(s):  
Carlisle Rainey ◽  
Kelly McCaskey

Abstract In small samples, maximum likelihood (ML) estimates of logit model coefficients have substantial bias away from zero. As a solution, we remind political scientists of Firth's (1993, Biometrika, 80, 27–38) penalized maximum likelihood (PML) estimator. Prior research has described and used PML, especially in the context of separation, but its small sample properties remain under-appreciated. The PML estimator eliminates most of the bias and, perhaps more importantly, greatly reduces the variance of the usual ML estimator. Thus, researchers do not face a bias-variance tradeoff when choosing between the ML and PML estimators—the PML estimator has a smaller bias and a smaller variance. We use Monte Carlo simulations and a re-analysis of George and Epstein (1992, American Political Science Review, 86, 323–337) to show that the PML estimator offers a substantial improvement in small samples (e.g., 50 observations) and noticeable improvement even in larger samples (e.g., 1000 observations).
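
The effect of the penalty can be seen in the simplest possible case. The sketch below covers only an intercept-only logit model, where Firth's penalty (the Jeffreys prior) has a closed form that adds half an observation to each outcome count. It is not the general estimator with covariates that the abstract discusses, and the function names are hypothetical.

```python
import math


def ml_logit(successes, n):
    """Maximum likelihood estimate of the intercept in an intercept-only logit."""
    if not 0 < successes < n:
        raise ValueError("ML estimate is infinite when all outcomes are identical")
    return math.log(successes / (n - successes))


def firth_logit(successes, n):
    """Penalized ML estimate for the same model.

    With the Jeffreys-prior penalty, the estimated probability is
    (successes + 0.5) / (n + 1), so the logit adds 0.5 to each count.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    return math.log((successes + 0.5) / (n - successes + 0.5))


# Small sample: the penalized estimate is pulled toward zero.
print(ml_logit(8, 10), firth_logit(8, 10))
# All outcomes identical: ML diverges, the penalized estimate stays finite.
print(firth_logit(10, 10))
```

In this special case the penalized estimate is always at least as close to zero as the ML estimate, which matches the direction of the bias correction described above.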


2021 ◽  
Vol 4 ◽  
Author(s):  
Martin Hektoen ◽  
Torbjørn Ekrem ◽  
Torkild Bakken

The completeness of reference libraries is often a limiting factor in the effectiveness of biomonitoring using molecular tools. The fact that these libraries are often built upon Sanger sequencing can create a substantial bias due to poor primer fits and unoptimized lab protocols. Some taxa of marine macroinvertebrates are known to be notoriously difficult to sequence using traditional, PCR-based means, and only about 15% of the known bioindicator species worldwide have publicly available sequences for any genetic marker (Aylagas et al. 2014). The Barcode of Life Data System (BOLD) indicates an amplification rate between 46% and 85% of the barcode region of COI for the most commonly used marine invertebrates that indicate pollution in the North East Atlantic (Capitellidae, Cirratulidae, Dorvilleidae, Spionidae and Tubificidae within Annelida, and Thyasiridae within Mollusca). An ongoing integrative taxonomic study on Prionospio Malmgren, 1867 (Spionidae, Annelida) exemplifies the extensive issues of utilizing Sanger sequencing on marine invertebrates. Amplification of the barcode region of COI was attempted using five primer pairs: three designed to be universal for marine invertebrates (Folmer et al. 1994, Geller et al. 2013, Lobo et al. 2013), one specialized for polychaetes (Carr et al. 2011) and one self-designed. In addition, two DNA polymerases (TaKaRa Ex Taq HS and Qiagen HotStarTaq) and three annealing temperatures were tested. Only five sequences of COI were obtained from a total of 255 PCR reactions (2% success rate). Other genetic markers showed better amplification rates: 58% for 16S rDNA, and more than 90% success rates for 28S rDNA and Histone H3. This illustrates the importance of having more than one marker in mind when seeking to complete reference libraries, and the potential effectiveness of a multi-marker approach in molecular biomonitoring surveys.
As sequencing costs decrease, utilizing shallow shotgun-based sequencing (genome skimming) on problematic groups such as Prionospio to bypass issues regarding unfit primers is also becoming a viable option.
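
The reported COI success rate follows directly from the counts given in the abstract. The short sketch below reproduces that arithmetic; the helper name `success_rate_percent` is hypothetical.

```python
def success_rate_percent(successes, attempts):
    """Share of PCR reactions that yielded a sequence, as a percentage."""
    if attempts <= 0 or not 0 <= successes <= attempts:
        raise ValueError("need attempts > 0 and 0 <= successes <= attempts")
    return 100.0 * successes / attempts


# Five COI sequences from 255 PCR reactions, as stated in the abstract.
coi_rate = success_rate_percent(5, 255)
print(f"COI success rate: {coi_rate:.1f}%")
```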

