confidence intervals
Recently Published Documents


TOTAL DOCUMENTS

6191
(FIVE YEARS 1081)

H-INDEX

129
(FIVE YEARS 8)

2022 ◽  
Vol 139 ◽  
pp. 1026-1043
Author(s):  
Dawn Iacobucci ◽  
Ayalla Ruvio ◽  
Sergio Román ◽  
Sangkil Moon ◽  
Paul M. Herr

2022 ◽  
Author(s):  
Alberto Celma ◽  
Richard Bade ◽  
Juan V. Sancho ◽  
Félix Hernández ◽  
Melissa Humpries ◽  
...  

Ultra-high performance liquid chromatography coupled to ion mobility separation and high-resolution mass spectrometry has proven very valuable for screening of emerging contaminants in the aquatic environment. However, when applying suspect or non-target approaches (i.e. when no reference standards are available), there is no information on retention time (RT) and collision cross section (CCS) values to facilitate identification. In silico prediction tools for RT and CCS can therefore be of great utility to decrease the number of candidates to investigate. In this work, Multivariate Adaptive Regression Splines (MARS) were evaluated for the prediction of both RT and CCS. MARS prediction models were developed and validated using a database of 477 protonated molecules, 169 deprotonated molecules and 249 sodium adducts. Multivariate and univariate models were evaluated, with the univariate models showing a better fit to the empirical data. The RT model (R² = 0.855) showed a deviation between predicted and empirical data of ±2.32 min (95% confidence interval). The deviation observed for CCS data of protonated molecules using the CCSH model (R² = 0.966) was ±4.05% at the 95% confidence level. The CCSH model was also tested for the prediction of deprotonated molecules, resulting in deviations below ±5.86% for 95% of cases. Finally, a third model was developed for sodium adducts (CCSNa, R² = 0.954), with deviations below ±5.25% for 95% of cases. The developed models have been incorporated into an open-access, user-friendly online platform, which represents a great advantage for third-party research laboratories predicting both RT and CCS data.
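The ±2.32 min RT window above is the deviation containing 95% of predicted-vs-empirical differences. A minimal sketch of that validation step, using made-up retention times rather than the paper's data (the MARS fitting itself is out of scope here):

```python
# Sketch of the validation step: given empirical retention times and model
# predictions, estimate the deviation window that covers 95% of residuals.
# All numbers below are illustrative, not the paper's data.

def residual_window(empirical, predicted, coverage=0.95):
    """Return the half-width containing `coverage` of absolute residuals."""
    abs_res = sorted(abs(e - p) for e, p in zip(empirical, predicted))
    # index of the residual bounding the requested coverage
    k = max(0, int(round(coverage * len(abs_res))) - 1)
    return abs_res[k]

# hypothetical RT data (minutes)
empirical = [1.2, 3.4, 5.1, 7.8, 9.0, 10.5, 12.2, 14.9]
predicted = [1.0, 3.9, 4.8, 8.3, 8.7, 10.9, 12.8, 14.1]

window = residual_window(empirical, predicted)
print(f"95% of candidates fall within ±{window:.2f} min of the predicted RT")
```

In a suspect-screening workflow, a candidate whose predicted RT differs from the observed RT by more than this window can be deprioritized.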


2022 ◽  
Author(s):  
Mahmudur Rahman Hera ◽  
N Tessa Pierce-Ward ◽  
David Koslicki

Sketching methods offer computational biologists scalable techniques to analyze data sets that continue to grow in size. MinHash is one such technique that has enjoyed recent broad application. However, traditional MinHash has previously been shown to perform poorly when applied to sets of very dissimilar sizes. FracMinHash was recently introduced as a modification of MinHash to compensate for this lack of performance when set sizes differ. While experimental evidence has been encouraging, FracMinHash has not yet been analyzed from a theoretical perspective. In this paper, we perform such an analysis and prove that while FracMinHash is not unbiased, this bias is easily corrected. Next, we detail how a simple mutation model interacts with FracMinHash, and we derive confidence intervals for evolutionary mutation distances between pairs of sequences as well as hypothesis tests for FracMinHash. We find that FracMinHash estimates the containment of a genome in a large metagenome more accurately and more precisely than traditional MinHash, and the confidence interval performs significantly better in estimating mutation distances. A Python-based implementation of the theorems we derive is freely available at https://github.com/KoslickiLab/mutation-rate-ci-calculator. The results presented in this paper can be reproduced using the code at https://github.com/KoslickiLab/ScaledMinHash-reproducibles.
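The core FracMinHash idea (keep only k-mers whose hash falls below a fraction `scale` of the hash range, then correct the containment estimate by a factor of the form 1 − (1 − scale)^|A| as discussed in the paper) can be sketched as follows. The hash function, sequences, k-mer size, and scale value are illustrative assumptions, not the authors' implementation:

```python
import hashlib

H = 2**64  # size of the hash range

def h(kmer: str) -> int:
    """Stable 64-bit hash of a k-mer (illustrative choice)."""
    return int.from_bytes(hashlib.sha256(kmer.encode()).digest()[:8], "big")

def frac_sketch(kmers, scale=0.5):
    """FracMinHash: keep every k-mer whose hash falls below scale * H."""
    thresh = int(scale * H)
    return {h(k) for k in kmers if h(k) < thresh}

def containment(sk_a, sk_b, scale, n_a):
    """Containment of A in B, divided by the bias-correction factor
    1 - (1 - scale)^|A| (n_a = number of distinct k-mers in A)."""
    if not sk_a:
        return 0.0
    raw = len(sk_a & sk_b) / len(sk_a)
    return raw / (1 - (1 - scale) ** n_a)

def kmers(seq, k=4):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

a = kmers("ACGTACGTGGCCTTAA")
b = kmers("ACGTACGTGGCCTTAACCGGTTACGATC")  # superset sequence
est = containment(frac_sketch(a), frac_sketch(b), 0.5, len(a))
print(f"estimated containment of A in B: {est:.3f}")  # close to 1, as A's k-mers are all in B
```

Because the same hash threshold is applied to both sets, the intersection of sketches estimates the intersection of the underlying k-mer sets without needing sketches of equal size, which is what makes the method robust to dissimilar set sizes.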


Molecules ◽  
2022 ◽  
Vol 27 (2) ◽  
pp. 457
Author(s):  
Elżbieta Gniazdowska ◽  
Wojciech Goch ◽  
Joanna Giebułtowicz ◽  
Piotr J. Rudzki

Background: The stability of a drug or its metabolites in biological matrices is an essential part of bioanalytical method validation, but the justification of the sample size (number of replicates) is insufficient. International guidelines differ in the sample size recommended for stability testing, ranging from no recommendation to at least three quality control samples. Testing of three samples may lead to results biased by a single outlier. We aimed to evaluate the optimal sample size for stability testing based on 90% confidence intervals. Methods: We conducted experimental, retrospective (264 confidence intervals for the stability of nine drugs during regulatory bioanalytical method validation), and theoretical (mathematical) studies. We generated experimental stability data (40 confidence intervals) for two analytes, tramadol and its major metabolite (O-desmethyl-tramadol), at two concentrations, in two storage conditions, and at five sample sizes (n = 3, 4, 5, 6, or 8). Results: The 90% confidence intervals were wider for low than for high concentrations in 18 out of 20 cases. For n = 5, each stability test passed, and the width of the confidence intervals was below 20%. The results of the retrospective study and the theoretical analysis supported the experimental observation that five or six repetitions ensure that confidence intervals fall within the 85–115% acceptance criteria. Conclusions: Five repetitions are optimal for the assessment of analyte stability. We hope to initiate discussion and stimulate further research on the sample size for stability testing.
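A sketch of the acceptance check described above: a 90% confidence interval for mean recovery that must fall within the 85–115% criteria. The recovery values are hypothetical; the t critical values are the standard two-sided 90% quantiles for the sample sizes the study used:

```python
import math

# Two-sided Student t critical values for 90% confidence, keyed by n (df = n - 1)
T90 = {3: 2.920, 4: 2.353, 5: 2.132, 6: 2.015, 8: 1.895}

def stability_ci(recoveries_pct):
    """90% confidence interval for mean recovery (% of nominal)."""
    n = len(recoveries_pct)
    mean = sum(recoveries_pct) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in recoveries_pct) / (n - 1))
    half = T90[n] * sd / math.sqrt(n)
    return mean - half, mean + half

# hypothetical recoveries for n = 5 stability samples
lo, hi = stability_ci([98.2, 101.5, 99.7, 97.9, 102.3])
passes = 85.0 <= lo and hi <= 115.0
print(f"90% CI: {lo:.1f}-{hi:.1f}%  ->  {'PASS' if passes else 'FAIL'}")
```

Note how the t multiplier shrinks from 2.920 at n = 3 to 2.132 at n = 5; combined with the 1/√n factor, this is why the paper finds three replicates fragile and five sufficient.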


2022 ◽  
Author(s):  
◽  
Steven Brasell

This research investigates the breakout of security prices from periods of sideways drift known as Triangles. Contributions are made to the existing literature by considering returns conditional on Triangles, in particular in terms of how momentum traders time positions, and by then using alternative statistical methods to show results more clearly. Returns are constructed by scanning for Triangle events and determining simulated trader returns from predetermined price levels. These are compared with a naive model consisting of randomly sampled events of comparable measure. Modelling of momentum results is achieved using an approach based on a marked Poisson point process, used to compare arrival times and profits/losses. These results are confirmed using a set of 10-day return heuristics, with bootstrapping used to define confidence intervals. Applying these methods to CRSP US equity data from 1960 to 2017 inclusive, US equities show a consistent but weak predictable return contribution after Triangle events occur; however, the effect has decreased over time, presumably as the market has become more efficient. While these observed short-term momentum changes in price have likely been compensated to a degree by risk, they do show that such patterns have contained forecastable information about US equities. Prices have likely been weakly affected by past prices, but the effect has since diminished to a negligible size as of 2017.
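The bootstrap step for the 10-day return heuristics can be sketched as below. The return values, resample count, and seed are illustrative assumptions, not the thesis's data:

```python
import random

def bootstrap_ci(returns, n_boot=5000, alpha=0.05, seed=7):
    """Percentile bootstrap confidence interval for the mean return."""
    rng = random.Random(seed)
    n = len(returns)
    means = sorted(
        sum(rng.choices(returns, k=n)) / n for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# hypothetical 10-day returns following simulated Triangle breakouts (%)
event_returns = [0.8, -0.3, 1.5, 0.2, 2.1, -1.0, 0.6, 0.9, 1.2, -0.4,
                 0.5, 1.8, 0.1, -0.7, 1.1]
lo, hi = bootstrap_ci(event_returns)
# if the interval excludes 0, the post-Triangle drift is distinguishable
# from noise at the chosen confidence level
print(f"95% bootstrap CI for mean return: [{lo:.2f}, {hi:.2f}]")
```

The same interval would be computed for the naive (randomly sampled) events; non-overlapping intervals are the heuristic evidence of a conditional return contribution.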



2022 ◽  
Vol 20 (6) ◽  
pp. 7-12
Author(s):  
A. O. Kovrigin ◽  
V. A. Lubennikov ◽  
I. B. Kolyado ◽  
I. V. Vikhlyanov ◽  
A. F. Lazarev ◽  
...  

The purpose of the study was to analyze cancer incidence in males born from 1932 to 1949 and living in rural settlements of the municipal districts of the Altai Krai affected by the fallout traces from the first Semipalatinsk nuclear test on August 29, 1949. Material and methods. An epidemiological retrospective cohort study was based on the analysis of anonymized data on newly diagnosed and morphologically verified cases of cancer in a male cohort for the period from 2007 to 2016. The study included a cohort fixed by the date of the first nuclear test, with a total of 6 383 males. In total, 633 cases of newly diagnosed and morphologically verified cancer were identified in the cohort. At the beginning of the study, all males were alive and had no previous diagnosis of cancer. For a comparative analysis of cancer incidence, the main (exposed) cohort comprised 2 291 men, and the control cohort included 4 092 men who lived in rural settlements of municipal districts of the region that were not on the fallout track of the first nuclear test conducted at the Semipalatinsk test site. The person-time incidence rate (PTR), its standard error (mPTR) and 95% confidence intervals (95% CI) were calculated. The incidence and the relative risk of developing cancer were assessed. Statistical analysis was carried out using Microsoft Office 2016. Results. The number of person-years was 16 731 in the main cohort and 30 747 in the control cohort. The PTR in the main cohort was 2 032.22 per 10⁵ person-years, with mPTR equal to 110.21 and 95% CI of 1 811.80–2 252.64. In the control cohort, the corresponding values were: PTR 952.94 per 10⁵ person-years, mPTR 55.67, and 95% CI 841.60–1 064.28. The most common cancer sites in men of the main cohort were the digestive organs (C15–C26), respiratory and intrathoracic organs (C30–C39), skin (C43–C44), and male genital organs (C60–C63).
In the control cohort, the most common sites were respiratory and intrathoracic organs (C30–C39), digestive organs (C15–C26), male genital organs (C60–C63) and skin (C43–C44). Conclusion. An increased relative risk of developing malignant neoplasms was revealed in men born and living in the Altai territory at the time of the first nuclear test conducted at the Semipalatinsk test site (RR = 2.133; 95% CI 1.824–2.493), with a standard error of the relative risk (s) of 0.0797. The main and control cohorts differed in the distribution of cancer sites.
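The reported relative risk and its interval follow from the standard log-normal approximation for a ratio of person-time rates. The case counts below (about 340 exposed, 293 control) are back-calculated from the stated rates and person-years rather than quoted directly from the abstract:

```python
import math

def person_time_rr(cases_a, py_a, cases_b, py_b, z=1.96):
    """Relative risk between two person-time incidence rates,
    with a log-normal 95% confidence interval."""
    rr = (cases_a / py_a) / (cases_b / py_b)
    se_ln = math.sqrt(1 / cases_a + 1 / cases_b)  # SE of ln(RR)
    lo = rr * math.exp(-z * se_ln)
    hi = rr * math.exp(z * se_ln)
    return rr, se_ln, (lo, hi)

# back-calculated from the abstract: ~340 cases / 16 731 person-years (exposed)
# vs ~293 cases / 30 747 person-years (control)
rr, s, (lo, hi) = person_time_rr(340, 16731, 293, 30747)
print(f"RR = {rr:.3f}, s = {s:.4f}, 95% CI {lo:.3f}-{hi:.3f}")
```

These reproduce the abstract's RR = 2.133, s = 0.0797, and 95% CI 1.824–2.493, which is a useful internal consistency check on the reported figures.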


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
James H. McVittie ◽  
David B. Wolfson ◽  
Vittorio Addona ◽  
Zhaoheng Li

When modelling the survival distribution of a disease whose symptomatic progression is insidious, it is not always clear how to measure the failure/censoring times from some true date of disease onset. In a prevalent cohort study with follow-up, one approach for removing any potential influence of uncertainty in the measurement of the true onset dates is to use only the residual lifetimes. As the residual lifetimes are measured from a well-defined screening date (prevalence day) to failure/censoring, these observed durations are essentially error-free. Using residual lifetime data, the nonparametric maximum likelihood estimator (NPMLE) may be used to estimate the underlying survival function; however, the resulting estimator can yield exceptionally wide confidence intervals. Conversely, parametric maximum likelihood estimation can yield narrower confidence intervals but may not be robust to model misspecification. Using only right-censored residual lifetime data, we propose a stacking procedure to overcome this non-robustness: the proposed estimator is a linear combination of individual nonparametric/parametric survival function estimators, with optimal stacking weights obtained by minimizing a Brier score loss function.
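A toy sketch of the stacking idea: choose the weight on the nonparametric estimator by minimizing a Brier-type loss on held-out lifetimes. Uncensored data and a single exponential parametric candidate are simplifying assumptions for brevity; the paper's estimator handles right censoring and uses proper Brier-score weighting:

```python
import math

# hypothetical residual lifetimes, split into fit and validation halves
fit = [2.0, 5.0, 8.0, 11.0, 3.0, 9.0]
val = [3.5, 6.5, 9.5, 14.0, 4.5, 7.5]

def km_surv(t):
    """Nonparametric survival from the fit half (empirical, no censoring)."""
    return sum(x > t for x in fit) / len(fit)

RATE = len(fit) / sum(fit)  # exponential MLE: 1 / mean lifetime

def exp_surv(t):
    """Parametric (exponential) survival fit."""
    return math.exp(-RATE * t)

def brier(w, grid):
    """Brier loss on the validation half for the stack w*KM + (1-w)*exp."""
    loss = 0.0
    for t in grid:
        s_hat = w * km_surv(t) + (1 - w) * exp_surv(t)
        loss += sum((s_hat - (x > t)) ** 2 for x in val) / len(val)
    return loss / len(grid)

grid = [2.5, 5.0, 7.5, 10.0]
best_w = min((i / 100 for i in range(101)), key=lambda w: brier(w, grid))
print(f"stacking weight on the nonparametric estimator: {best_w:.2f}")
```

By construction the stacked estimator does at least as well (in this loss) as either component alone, which is the motivation for stacking over committing to one model.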


2022 ◽  
Author(s):  
Daniel Irwin ◽  
David R. Mandel

Organizations in several domains, including national security intelligence, communicate judgments under uncertainty using verbal probabilities (e.g., likely) instead of numeric probabilities (e.g., 75% chance), despite research indicating that the former have variable meanings across individuals. In the intelligence domain, uncertainty is also communicated using terms such as low, moderate, or high to describe the analyst's confidence level. However, little research has examined how intelligence professionals interpret these terms and whether they prefer them to numeric uncertainty quantifiers. In two experiments (N = 481 and 624, respectively), uncertainty communication preferences of expert (n = 41 intelligence analysts in Experiment 1) and non-expert intelligence consumers were elicited. We examined which format participants judged to be more informative and simpler to process. We further tested whether participants treated probability and confidence as independent constructs and whether participants provided coherent numeric probability translations of verbal probabilities. Results showed that whereas most non-experts favored the numeric format, experts were about equally split, and most participants in both samples regarded the numeric format as more informative. Experts and non-experts consistently conflated probability and confidence. For instance, confidence intervals inferred from verbal confidence terms had a greater effect on the location of the estimate than on its width, contrary to normative expectation. Approximately one-quarter of experts and over one-half of non-experts provided incoherent numeric probability translations of best estimates and lower and upper bounds when elicitations were spaced by intervening tasks.
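One plausible reading of the coherence criterion (an elicited numeric translation is coherent when the lower bound, best estimate, and upper bound are ordered valid probabilities) can be sketched as below; the elicited triples are invented for illustration:

```python
def is_coherent(lower, best, upper):
    """A probability triple is coherent when the bounds bracket the best
    estimate and all three values are valid probabilities."""
    return 0.0 <= lower <= best <= upper <= 1.0

# hypothetical translations of "likely" from three respondents
elicitations = [(0.60, 0.75, 0.90),   # coherent
                (0.70, 0.65, 0.85),   # best estimate below lower bound
                (0.55, 0.80, 0.75)]   # upper bound below best estimate
share = sum(is_coherent(*e) for e in elicitations) / len(elicitations)
print(f"coherent responses: {share:.0%}")
```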


2022 ◽  
Author(s):  
Conor G McAloon ◽  
Darren Dahly ◽  
Cathal Walsh ◽  
Patrick Wall ◽  
Breda Smyth ◽  
...  

Rapid Antigen Diagnostic Tests (RADTs) for the detection of SARS-CoV-2 offer advantages in that they are cheaper and faster than currently used PCR tests, but have reduced sensitivity and specificity. One potential application of RADTs is to facilitate gatherings of individuals through testing of attendees at, or immediately prior to, entry at a venue. Understanding the baseline risk in the tested population is of particular importance when evaluating the utility of applying diagnostic tests for screening purposes. We used incidence data to estimate the prevalence of infectious individuals in the community at a particular time point and simulated mass gatherings by sampling from a series of age cohorts. Nine illustrative scenarios were simulated: small (n=100), medium (n=1,000) and large (n=10,000) gatherings, each with three possible age constructs: mostly younger, mostly older, or equal numbers from each age cohort. For each scenario, we estimated the prevalence of infectious attendees, then simulated the likely number of positive and negative test results, the proportion of cases detected, the corresponding positive and negative predictive values, and the cost per case identified. Our findings suggest that for each detected individual on a given day, there are likely to be 13.8 additional infectious individuals also present in the community. Prevalence of infectious individuals at events was highest with mostly younger attendees (1.00%), followed by homogeneous-age gatherings (0.55%), and lowest at mostly older events (0.26%). For small events (100 attendees) the expected number of infectious attendees was less than 1 across all age constructs. For large events (10,000 attendees) the expected number of infectious attendees ranged from 26 (95% confidence interval 12 to 45) for mostly older events to almost 100 (95% confidence interval 46 to 174) for mostly younger attendees.
Given rapid changes in SARS-CoV-2 incidence over time, we developed an R Shiny app to allow users to run updated simulations for specific events.
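The screening arithmetic behind these expected counts follows standard predictive-value formulas. The prevalence, sensitivity, and specificity below are illustrative assumptions, not the paper's estimates:

```python
# Expected outcomes when screening a gathering of n attendees with a test
# of the given sensitivity and specificity, at prevalence `prev`.
def screen_event(n, prev, sens, spec):
    infectious = n * prev
    tp = infectious * sens                  # infectious, detected
    fn = infectious * (1 - sens)            # infectious, missed
    fp = (n - infectious) * (1 - spec)      # not infectious, flagged
    ppv = tp / (tp + fp)                    # P(infectious | positive test)
    tn = (n - infectious) - fp
    npv = tn / (tn + fn)                    # P(not infectious | negative test)
    return tp, fn, fp, ppv, npv

# hypothetical large event: 10,000 attendees, 1% prevalence,
# RADT with 80% sensitivity and 99% specificity
tp, fn, fp, ppv, npv = screen_event(n=10000, prev=0.01, sens=0.8, spec=0.99)
print(f"expected: {tp:.0f} detected, {fn:.0f} missed, {fp:.0f} false alarms")
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```

At low prevalence the false positives can rival the true positives even with high specificity, which is why the baseline risk in the tested population matters so much for this screening application.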

