dream: Powerful differential expression analysis for repeated measures designs

Abstract: Large-scale transcriptome studies with multiple samples per individual are widely used to study disease biology. Yet current methods for differential expression are inadequate for cross-individual testing for these repeated measures designs. Most problematic, we observe across multiple datasets that current methods can give reproducible false positive findings that are driven by genetic regulation of gene expression, yet are unrelated to the trait of interest. Here we introduce a statistical software package, dream, that increases power, controls the false positive rate, enables multiple types of hypothesis tests, and integrates with standard workflows. In 12 analyses in 6 independent datasets, dream yields biological insight not found with existing software while addressing the issue of reproducible false positive findings. Dream is available within the variancePartition Bioconductor package (http://bioconductor.org/packages/variancePartition).

Dream: powerful differential expression analysis for repeated measures designs

Bioinformatics ◽

10.1093/bioinformatics/btaa687 ◽

2020 ◽

Cited By ~ 4

Author(s):

Gabriel E Hoffman ◽

Panos Roussos

Keyword(s):

Differential Expression ◽

False Positive ◽

Repeated Measures ◽

Large Scale ◽

False Positive Rate ◽

Differential Expression Analysis ◽

Genetic Regulation ◽

Regulation Of Gene Expression ◽

Supplementary Information ◽

Repeated Measures Designs

Abstract Summary: Large-scale transcriptome studies with multiple samples per individual are widely used to study disease biology. Yet, current methods for differential expression are inadequate for cross-individual testing for these repeated measures designs. Most problematic, we observe across multiple datasets that current methods can give reproducible false-positive findings that are driven by genetic regulation of gene expression, yet are unrelated to the trait of interest. Here, we introduce a statistical software package, dream, that increases power, controls the false positive rate, enables multiple types of hypothesis tests, and integrates with standard workflows. In 12 analyses in 6 independent datasets, dream yields biological insight not found with existing software while addressing the issue of reproducible false-positive findings. Availability and implementation: Dream is available within the variancePartition Bioconductor package at http://bioconductor.org/packages/variancePartition. Supplementary information: Supplementary data are available at Bioinformatics online.
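
The false-positive problem described in this abstract can be illustrated with a small simulation. The sketch below is not dream and does not use the variancePartition package; it is a simplified stand-in under stated assumptions: each individual has a stable baseline expression level (for example, from genetic regulation), the trait has no true effect, and a naive test that treats every repeated sample as independent is compared with a test on per-individual means. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t statistic comparing the means of two samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate_false_positive_rates(n_sims=500, n_per_group=20, n_reps=4,
                                  sd_between=1.0, sd_within=0.5, seed=0):
    """Simulate a trait with NO true effect; return (naive_rate, collapsed_rate)."""
    rng = random.Random(seed)
    critical = 1.96  # normal approximation to a two-sided 5% test (assumption)
    naive_hits = 0
    collapsed_hits = 0
    for _ in range(n_sims):
        groups = []
        for _group in range(2):
            individuals = []
            for _ind in range(n_per_group):
                # Individual-specific baseline shared by all of that person's samples
                baseline = rng.gauss(0.0, sd_between)
                individuals.append(
                    [baseline + rng.gauss(0.0, sd_within) for _ in range(n_reps)]
                )
            groups.append(individuals)

        # Naive analysis: pool all samples as if they were independent
        flat = [[x for ind in g for x in ind] for g in groups]
        if abs(welch_t(flat[0], flat[1])) > critical:
            naive_hits += 1

        # Individual-aware analysis: one mean per individual
        means = [[statistics.mean(ind) for ind in g] for g in groups]
        if abs(welch_t(means[0], means[1])) > critical:
            collapsed_hits += 1
    return naive_hits / n_sims, collapsed_hits / n_sims


naive_rate, collapsed_rate = simulate_false_positive_rates()
print(f"naive: {naive_rate:.3f}, per-individual means: {collapsed_rate:.3f}")
```

Averaging per individual is used here only because it is easy to write with the standard library; the abstract describes a package that supports more general hypothesis tests than this two-group comparison.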

Identification of and Correction for Publication Bias: Comment

10.31222/osf.io/dh87m ◽

2019 ◽

Author(s):

Amanda Kvarven ◽

Eirik Strømland ◽

Magnus Johannesson

Keyword(s):

Publication Bias ◽

False Positive ◽

Large Scale ◽

Meta Analysis ◽

False Positive Rate ◽

Effect Sizes ◽

Replication Studies ◽

Moderate Reduction ◽

Positive Rate ◽

Meta Analyses

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.

Bloom Filter-Based Secure Data Forwarding in Large-Scale Cyber-Physical Systems

Mathematical Problems in Engineering ◽

10.1155/2015/150512 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12

Author(s):

Siyu Lin ◽

Hao Wu

Keyword(s):

False Positive ◽

Large Scale ◽

False Positive Rate ◽

Bloom Filter ◽

Cyber Physical Systems ◽

Security Requirements ◽

Data Forwarding ◽

Physical Systems ◽

Secure Data ◽

Positive Rate

Cyber-physical systems (CPSs) connect with the physical world via communication networks, which significantly increases the security risks of CPSs. To secure sensitive data, secure forwarding is an essential component of CPSs. However, CPSs have high-dimensional, multiattribute, and multilevel security requirements due to their significantly increased scale and diversity, and hence place high demands on the querying and storage of secure forwarding information. To tackle these challenges, we propose a practical secure data forwarding scheme for CPSs. Considering the limited storage capability and computational power of entities, we adopt a Bloom filter to store the secure forwarding information for each entity, which achieves a good balance between storage consumption and query delay. Furthermore, a novel link-based Bloom filter construction method is designed to reduce the false positive rate during Bloom filter construction. Finally, the effects of the false positive rate on the performance of Bloom filter-based secure forwarding with different routing policies are discussed.
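
For readers unfamiliar with the data structure, the following is a minimal sketch of a standard Bloom filter and its textbook false positive estimate. It is not the link-based construction proposed in the paper; the class name, the double-hashing scheme over a SHA-256 digest, and the example keys are assumptions made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """A plain Bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by double hashing two slices of one digest
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def expected_false_positive_rate(num_bits, num_hashes, num_items):
    """Standard approximation: (1 - exp(-k * n / m)) ** k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical forwarding entries stored for one entity
bloom = BloomFilter(num_bits=1024, num_hashes=4)
entries = [f"entity-{i}->entity-{i + 1}" for i in range(100)]
for entry in entries:
    bloom.add(entry)

estimated_rate = expected_false_positive_rate(1024, 4, len(entries))
print(f"estimated false positive rate: {estimated_rate:.4f}")
```

The trade-off the abstract mentions is visible in the parameters: more bits per stored entry lowers the false positive rate at the cost of storage, while the query cost stays at a fixed number of hash evaluations.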

Sex differences in the human brain transcriptome of cases with schizophrenia

10.1101/2020.10.05.326405 ◽

2020 ◽

Author(s):

Gabriel E. Hoffman ◽

Yixuan Ma ◽

Kelsey S. Montgomery ◽

Jaroslav Bendl ◽

Manoj Kumar Jaiswal ◽

...

Keyword(s):

Gene Expression ◽

Sex Differences ◽

Differential Expression ◽

Large Scale ◽

Molecular Mechanisms ◽

Age Of Onset ◽

Differential Expression Analysis ◽

Gene Expression Signature ◽

Synaptic Organization ◽

Expression Signature

Abstract: While schizophrenia differs between males and females in age of onset, symptomatology and the course of the disease, the molecular mechanisms underlying these differences remain uncharacterized. In order to address questions about the sex-specific effects of schizophrenia, we performed a large-scale transcriptome analysis of RNA-seq data from 437 controls and 341 cases from two distinct cohorts from the CommonMind Consortium. Analysis across the cohorts identifies a reproducible gene expression signature of schizophrenia that is highly concordant with previous work. Differential expression across sex is reproducible across cohorts and identifies X- and Y-linked genes, as well as those involved in dosage compensation. Intriguingly, the sex expression signature is also enriched for genes involved in neurexin family protein binding and synaptic organization. Differential expression analysis testing a sex-by-diagnosis interaction effect did not identify any genome-wide signature after multiple testing corrections. Gene coexpression network analysis was performed to reduce dimensionality and elucidate interactions among genes. We found enrichment of co-expression modules for sex-by-diagnosis differential expression signatures, which were highly reproducible across the two cohorts and involve a number of diverse pathways, including neural nucleus development, neuron projection morphogenesis, and regulation of neural precursor cell proliferation. Overall, our results indicate that the effect size of sex differences in schizophrenia gene expression signatures is small and underscore the challenge of identifying robust sex-by-diagnosis signatures, which will require future analyses in larger cohorts.

Deep Learning Model Improves Radiologists’ Performance in Detection and Classification of Breast Lesions

10.21203/rs.3.rs-746374/v1 ◽

2021 ◽

Author(s):

Ying-Shi Sun ◽

Yu-Hong Qu ◽

Dong Wang ◽

Yi Li ◽

Lin Ye ◽

...

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Roc Curve ◽

False Positive ◽

Large Scale ◽

False Positive Rate ◽

Training Dataset ◽

Validation Dataset ◽

Breast Lesions ◽

Positive Rate

Abstract Background: Computer-aided diagnosis using deep learning algorithms has been initially applied in the field of mammography, but there is no large-scale clinical application. Methods: This study proposed to develop and verify an artificial intelligence model based on mammography. Firstly, retrospectively collected mammograms from six centers were randomized to a training dataset and a validation dataset for establishing the model. Secondly, the model was tested by comparing 12 radiologists’ performance with and without it. Finally, prospective multicenter mammograms were diagnosed by radiologists with the model. The detection and diagnostic capabilities were evaluated using the free-response receiver operating characteristic (FROC) curve and ROC curve. Results: The sensitivity of the model for detecting lesions after matching was 0.908 at a false positive rate of 0.25 in unilateral images. The area under the ROC curve (AUC) for distinguishing benign from malignant lesions was 0.855 (95% CI: 0.830, 0.880). The performance of 12 radiologists with the model was higher than that of radiologists alone (AUC: 0.852 vs. 0.808, P = 0.005). The mean reading time with the model was shorter than that of reading alone (80.18 s vs. 62.28 s, P = 0.03). In prospective application, the sensitivity of detection reached 0.887 at a false positive rate of 0.25; the AUC of radiologists with the model was 0.983 (95% CI: 0.978, 0.988), with sensitivity, specificity, PPV, and NPV of 94.36%, 98.07%, 87.76%, and 99.09%, respectively. Conclusions: The artificial intelligence model exhibits high accuracy for detecting and diagnosing breast lesions, improves diagnostic accuracy and saves time. Trial registration: NCT, NCT03708978. Registered 17 April 2018, https://register.clinicaltrials.gov/prs/app/ NCT03708978
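
The four summary statistics quoted in the results (sensitivity, specificity, PPV, NPV) all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Illustrative counts only (assumed, not from the paper)
metrics = diagnostic_metrics(tp=90, fp=20, tn=180, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the proportion of malignant cases in the sample, whereas sensitivity and specificity do not, which is why the same model can report different predictive values in different cohorts.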

PISCES: a package for rapid quantitation and quality control of large scale mRNA-seq datasets

10.1101/2020.12.01.390575 ◽

2020 ◽

Author(s):

Matthew D. Shirley ◽

Viveksagar K. Radhakrishna ◽

Javad Golji ◽

Joshua M. Korn

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

High Performance ◽

Large Scale ◽

Differential Expression Analysis ◽

Link Type ◽

File Formats ◽

Comparison Groups ◽

Reproducible Analysis ◽

High Performance Computing Cluster

Abstract: PISCES eases processing of large mRNA-seq experiments by encouraging capture of metadata using simple textual file formats, processing samples on either a single machine or in parallel on a high performance computing cluster (HPC), validating sample identity using genetic fingerprinting, and summarizing all outputs in analysis-ready data matrices. PISCES consists of two modules: 1) compute cluster-aware analysis of individual mRNA-seq libraries including species detection, SNP genotyping, library geometry detection, and quantitation using salmon, and 2) gene-level transcript aggregation, transcriptional and read-based QC, TMM normalization and differential expression analysis of multiple libraries to produce data ready for visualization and further analysis. PISCES is implemented as a python3 package and is bundled with all necessary dependencies to enable reproducible analysis and easy deployment. JSON configuration files are used to build and identify transcriptome indices, and CSV files are used to supply sample metadata and to define comparison groups for differential expression analysis using DESeq2. PISCES builds on many existing open-source tools, and releases of PISCES are available on GitHub or the Python Package Index (PyPI).

Prediction of Error Associated with False-Positive Rate Determination for Peptide Identification in Large-Scale Proteomics Experiments Using a Combined Reverse and Forward Peptide Sequence Database Strategy

Journal of Proteome Research ◽

10.1021/pr0603194 ◽

2007 ◽

Vol 6 (1) ◽

pp. 392-398 ◽

Cited By ~ 49

Author(s):

Edward L. Huttlin ◽

Adrian D. Hegeman ◽

Amy C. Harms ◽

Michael R. Sussman

Keyword(s):

False Positive ◽

Large Scale ◽

False Positive Rate ◽

Peptide Identification ◽

Peptide Sequence ◽

Sequence Database ◽

Positive Rate

Impact of RNA-seq attributes on false positive rates in differential expression analysis of de novo assembled transcriptomes

BMC Research Notes ◽

10.1186/1756-0500-6-503 ◽

2013 ◽

Vol 6 (1) ◽

Cited By ~ 13

Author(s):

Emmanuel González ◽

Simon Joly

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

False Positive ◽

De Novo ◽

Differential Expression Analysis ◽

Rna Seq

A collaborative approach for national cybersecurity incident management

Information and Computer Security ◽

10.1108/ics-02-2020-0027 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Oluwafemi Oriola ◽

Adesesan Barnabas Adeyemo ◽

Maria Papadaki ◽

Eduan Kotzé

Keyword(s):

False Positive ◽

Large Scale ◽

False Positive Rate ◽

Incident Management ◽

Collaborative Approach ◽

Total Response ◽

Content Type ◽

Attack Scenario ◽

Positive Rate ◽

Incident Handling

Purpose: Collaborative-based national cybersecurity incident management benefits from the huge volume of incident information, large-scale information security devices and aggregation of security skills. However, no existing collaborative approach has been able to cater for the multiple regulators, divergent incident views and incident reputation trust issues that national cybersecurity incident management presents. This paper aims to propose a collaborative approach to handle these issues cost-effectively. Design/methodology/approach: A collaborative-based national cybersecurity incident management architecture based on the ITU-T X.1056 security incident management framework is proposed. It is composed of a cooperative regulatory unit with cooperative and third-party management strategies and an execution unit with incident handling and response strategies. Novel collaborative incident prioritization and mitigation planning models that are fit for incident handling in national cybersecurity incident management are proposed. Findings: A use case depicting how the collaborative-based national cybersecurity incident management would function within a typical information and communication technology ecosystem is illustrated. The proposed collaborative approach is evaluated based on the performance of an experimental cyber-incident management system against two multistage attack scenarios. The results show that the proposed approach is more reliable compared to the existing ones based on descriptive statistics. Originality/value: The approach produces better incident impact scores and rankings than standard tools. The approach reduces the total response costs by 8.33% and the false positive rate by 97.20% for the first attack scenario, while it reduces the total response costs by 26.67% and the false positive rate by 78.83% for the second attack scenario.

rmRNAseq: differential expression analysis for repeated-measures RNA-seq data

Bioinformatics ◽

10.1093/bioinformatics/btaa525 ◽

2020 ◽

Vol 36 (16) ◽

pp. 4432-4439 ◽

Cited By ~ 1

Author(s):

Yet Nguyen ◽

Dan Nettleton

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Repeated Measures ◽

Differential Expression Analysis ◽

Parametric Bootstrap ◽

R Package ◽

Experimental Unit ◽

Multiple Time ◽

Rna Seq ◽

Sequencing Technologies

Abstract Motivation: With the reduction in price of next-generation sequencing technologies, gene expression profiling using RNA-seq has increased the scope of sequencing experiments to include more complex designs, such as designs involving repeated measures. In such designs, RNA samples are extracted from each experimental unit at multiple time points. The read counts that result from RNA sequencing of the samples extracted from the same experimental unit tend to be temporally correlated. Although there are many methods for RNA-seq differential expression analysis, existing methods do not properly account for within-unit correlations that arise in repeated-measures designs. Results: We address this shortcoming by using normalized log-transformed counts and associated precision weights in a general linear model pipeline with continuous autoregressive structure to account for the correlation among observations within each experimental unit. We then utilize parametric bootstrap to conduct differential expression inference. Simulation studies show the advantages of our method over alternatives that do not account for the correlation among observations within experimental units. Availability and implementation: We provide an R package rmRNAseq implementing our proposed method (function TC_CAR1) at https://cran.r-project.org/web/packages/rmRNAseq/index.html. Reproducible R codes for data analysis and simulation are available at https://github.com/ntyet/rmRNAseq/tree/master/simulation.
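
To make the "continuous autoregressive structure" concrete, here is a minimal sketch of a continuous AR(1) correlation matrix, in which the correlation between two observations on the same unit decays as rho raised to the absolute time difference. This is the general textbook form, not code from the rmRNAseq package; the function name, the time points, and the value of rho are assumptions for illustration.

```python
def car1_correlation(times, rho):
    """Continuous AR(1) correlation: corr(i, j) = rho ** |t_i - t_j|."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return [[rho ** abs(ti - tj) for tj in times] for ti in times]


# Hypothetical, unevenly spaced sampling times for one experimental unit
times = [0, 1, 3, 7]
corr = car1_correlation(times, rho=0.5)
for row in corr:
    print(["%.4f" % value for value in row])
```

Unlike a discrete AR(1) structure indexed by observation order, this form handles unevenly spaced time points directly, because the exponent is the actual elapsed time between samples.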
