Guide for Verifying Computer-Generated Test Results Through The Use Of Standard Data Sets

10.1520/e2443 ◽  
2008 ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yance Feng ◽  
Lei M. Li

Abstract Background Normalization of RNA-seq data aims at identifying biological expression differentiation between samples by removing the effects of unwanted confounding factors. Explicitly or implicitly, the justification of normalization requires a set of housekeeping genes. However, the existence of housekeeping genes common to a very large collection of samples, especially under a wide range of conditions, is questionable. Results We propose to carry out pairwise normalization with respect to multiple references, selected from representative samples. Then the pairwise intermediates are integrated based on a linear model that adjusts the reference effects. Motivated by the notion of housekeeping genes and their statistical counterparts, we adopt the robust least trimmed squares regression in pairwise normalization. The proposed method (MUREN) is compared with other existing tools on some standard data sets. The evaluation of normalization quality emphasizes preserving possible asymmetric differentiation, whose biological significance is exemplified by a single-cell data set on the cell cycle. MUREN is implemented as an R package. The code, licensed under GPL-3, is available on GitHub: github.com/hippo-yf/MUREN and on conda: anaconda.org/hippo-yf/r-muren. Conclusions MUREN performs the RNA-seq normalization using a two-step statistical regression induced from a general principle. We propose that the densities of pairwise differentiations be used to evaluate the goodness of normalization. MUREN adjusts the mode of differentiation toward zero while preserving the skewness due to biological asymmetric differentiation. Moreover, by robustly integrating pre-normalized counts with respect to multiple references, MUREN is immune to individual outlier samples.
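The two-step idea in this abstract (robust pairwise normalization against several references, then integration that adjusts for reference effects) can be illustrated with a minimal Python sketch. It is not the MUREN implementation, which is an R package. The function names, the `keep_fraction` parameter, the reduction of least trimmed squares to a single per-sample shift on log counts, and the use of row centering plus a median for the integration step are all assumptions made for illustration.

```python
import numpy as np


def lts_location(values, keep_fraction=0.5):
    """Least trimmed squares estimate of a location (1-D case).

    Among all windows of h consecutive sorted values, return the mean of
    the window with the smallest sum of squared deviations.
    """
    x = np.sort(np.asarray(values, dtype=float))
    h = max(1, int(np.ceil(keep_fraction * x.size)))
    # Prefix sums give every window's sum and sum of squares in O(n).
    c1 = np.concatenate(([0.0], np.cumsum(x)))
    c2 = np.concatenate(([0.0], np.cumsum(x * x)))
    window_sum = c1[h:] - c1[:-h]
    window_sq = c2[h:] - c2[:-h]
    sse = window_sq - window_sum ** 2 / h
    best = int(np.argmin(sse))
    return window_sum[best] / h


def normalize_multi_reference(log_counts, reference_idx, keep_fraction=0.5):
    """Illustrative two-step normalization of a genes x samples matrix.

    Step 1: estimate a robust shift of every sample against each reference.
    Step 2: remove each reference's own effect by centering its row, then
            combine the references with a median (an assumed stand-in for
            the linear model mentioned in the abstract).
    """
    log_counts = np.asarray(log_counts, dtype=float)
    n_samples = log_counts.shape[1]
    shifts = np.empty((len(reference_idx), n_samples))
    for r, ref in enumerate(reference_idx):
        for s in range(n_samples):
            diff = log_counts[:, s] - log_counts[:, ref]
            shifts[r, s] = lts_location(diff, keep_fraction)
    centered = shifts - shifts.mean(axis=1, keepdims=True)
    sample_effect = np.median(centered, axis=0)
    return log_counts - sample_effect
```

Because the trimmed estimate ignores the genes that deviate most from the bulk, a block of genes that is up-regulated in only one sample stays shifted after normalization instead of being averaged away, which is the asymmetry-preserving behaviour the abstract describes.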


1995 ◽  
Vol 78 (6) ◽  
pp. 1513-1515
Author(s):  
Richard H Albert ◽  
William Horwitz

Abstract Three problems arise in handling numerical values in databases: bad data, missing data, and sloppy data. The effects of bad data are mitigated by using statistical subterfuges such as robust statistics or outlier removal. Missing data are replaced by creating a substitute through interpolation or by using statistics appropriate to unbalanced designs. Sloppy, semiquantitative data are relegated to innocuous positions by using nonparametric, rank, or attribute statistics. These techniques are illustrated by the telephone directory, a database of carcinogenicity test results, and a database of precision parameters derived from method performance (collaborative) studies.
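The three remedies named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not specify any particular procedure, so the median/MAD outlier rule (with the conventional 0.6745 scaling and 3.5 cutoff), the linear interpolation of gaps, and the simple rank transform that ignores ties are choices made here, and all function names are hypothetical.

```python
import numpy as np


def flag_outliers(values, threshold=3.5):
    """Flag "bad" values with a modified z-score based on median and MAD."""
    x = np.asarray(values, dtype=float)
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))
    if mad == 0:
        # No spread around the median: nothing can be scored, flag nothing.
        return np.zeros(x.shape, dtype=bool)
    modified_z = 0.6745 * (x - med) / mad
    return np.abs(modified_z) > threshold


def fill_missing(values):
    """Replace NaN entries by linear interpolation between known neighbors."""
    x = np.asarray(values, dtype=float)
    idx = np.arange(x.size)
    known = ~np.isnan(x)
    return np.interp(idx, idx[known], x[known])


def to_ranks(values):
    """Convert semiquantitative values to ranks 1..n (ties are not averaged)."""
    x = np.asarray(values, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(x.size, dtype=int)
    ranks[order] = np.arange(1, x.size + 1)
    return ranks
```

For example, `flag_outliers([10, 11, 9, 10, 100])` flags only the last value, `fill_missing([1.0, float("nan"), 3.0])` fills the gap with 2.0, and `to_ranks([30, 10, 20])` gives ranks 3, 1, 2.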


2012 ◽  
Vol 58 (9) ◽  
pp. 1364-1367 ◽  
Author(s):  
Vilte E Barakauskas ◽  
Rebecka Davis ◽  
Matthew D Krasowski ◽  
Gwendolyn A McMillin

Abstract BACKGROUND False-positive drug screen results for tetrahydrocannabinol (THC) have been observed. This study investigated the rate of unconfirmed positive screen results in infant and noninfant urine samples and evaluated possible reasons for differences. METHODS The rate of unconfirmed positive THC screen results for urine samples was determined retrospectively in 2 independent data sets (n = 14 859, reference laboratory; n = 21 807, hospital laboratory) by comparing positive immunoassay-based drug screen results with the associated results of confirmation tests. We then assessed the rate of positive THC screens for samples with varying likelihoods of cannabinoid presence to evaluate the contribution of infant-specific urine constituents to positive results. Finally, a method to detect a THC metabolite (11-hydroxy-Δ9-THC) that occurs in meconium was developed to determine its prevalence in infant urine. RESULTS Positive screen results failed to confirm more frequently in samples from infants (47%) than in noninfants (0.8%). The hospital laboratory observed a similar discrepancy with a different immunoassay. Infant samples with a high likelihood of containing cannabinoids despite negative confirmatory results had a similar rate of positive screening results (50%, n = 20), whereas all samples with a low likelihood of containing cannabinoids screened negative (n = 23). 11-Hydroxy-Δ9-THC was not detected in any infant urine sample tested (n = 16). CONCLUSIONS Conventional confirmatory methods for THC may be inappropriate for urine samples from infants. Our results suggest that one or more currently unrecognized THC-associated compounds are responsible for positive THC screen results for infant urine, as opposed to an infant-associated interference.


2012 ◽  
Vol 263-266 ◽  
pp. 1523-1526
Author(s):  
Yan Hai Wu ◽  
Jia Xin Li ◽  
Fang Ni Zhang

This article addresses the problem of varying font sizes in video text. It implements a multi-scale, corner-based text detection algorithm and, exploiting the fact that the characters in a piece of text usually share the same color, uses color clustering to locate the text area more precisely. The algorithm detects text not only in game scenes, video advertisements, and video news, but can also be used to detect and locate text in natural scenes. Finally, the method is tested on a public data set, and the results show that it can detect and locate text in videos containing complex information.


Geophysics ◽  
2014 ◽  
Vol 79 (4) ◽  
pp. B135-B149 ◽  
Author(s):  
Elahe P. Ardakani ◽  
Douglas R. Schmitt ◽  
Todd D. Bown

The Devonian Grosmont Formation in northeastern Alberta, Canada, is the world’s largest accumulation of heavy oil in carbonate rock with estimated bitumen in place of [Formula: see text]. Much of the reservoir unconformably subcrops beneath Cretaceous sediments. This is an eroded surface modified by karstification known as the Sub-Mannville Unconformity (SMU). We studied the reanalysis and integration of legacy seismic data sets obtained in the mid-1980s to investigate the structure of this surface. Standard data processing was carried out supplemented by some more modern approaches to noise reduction. The interpretation of these reprocessed data resulted in some key structural maps above and below the SMU. These seismic maps revealed substantially more detail than those constructed solely on the basis of well-log data; in fact, the use of only well-log information would likely result in erroneous interpretations. Although features smaller than about 40 m in radius cannot be easily discerned at the SMU due to wavefield and data sampling limits, the data did reveal the existence of a roughly east–west-trending ridge-valley system. A more minor northeast–southwest-trending linear valley also was apparent. These observations are all consistent with the model of a karsted/eroded carbonate surface. Comparison of the maps for the differing horizons further suggested that deeper horizons may influence the structure of the SMU and even the overlying Mesozoic formations. This suggested that some displacements due to karst cavity collapse or minor faulting within the Grosmont occurred during or after deposition of the younger Mesozoic sediments on top of the Grosmont surface.

