Concepts and Software Package for Efficient Quality Control in Targeted Metabolomics Studies – MeTaQuaC

2020 ◽  
Author(s):  
Mathias Kuhring ◽  
Alina Eisenberger ◽  
Vanessa Schmidt ◽  
Nicolle Kränkel ◽  
David M. Leistner ◽  
...  

ABSTRACT Targeted quantitative mass spectrometry metabolite profiling is the workhorse of metabolomics research. Robust and reproducible data are essential for confidence in analytical results and are particularly important in large-scale studies. Commercial kits are now available which use carefully calibrated and validated internal and external standards to provide such reliability. However, they are still subject to processing and technical errors in their use and should be subject to a laboratory’s routine quality assurance and quality control measures to maintain confidence in the results. We discuss important systematic and random measurement errors when using these kits and suggest measures to detect and quantify them. We demonstrate how wider analysis of the entire dataset, alongside standard analyses of quality control samples, can be used to identify outliers and quantify systematic trends in order to improve downstream analysis. Finally, we present the MeTaQuaC software, which implements the above concepts and methods for Biocrates kits and creates a comprehensive quality control report containing rich visualizations and informative scores and summary statistics. Preliminary unsupervised multivariate analysis methods are also included to provide rapid insight into study variables and groups. MeTaQuaC is provided as an open-source R package under a permissive MIT license and includes detailed user documentation.
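The drift detection idea in the abstract, quantifying systematic trends across the injection sequence, can be sketched generically. The following Python sketch is illustrative only (it is not the MeTaQuaC API, and the 2% tolerance is an arbitrary assumption): fit an ordinary least-squares slope of a QC sample's metabolite concentration against injection order and flag the metabolite when the relative per-injection drift is large.

```python
# Illustrative sketch (not the MeTaQuaC API): detect a systematic drift in
# repeated QC-sample measurements of one metabolite across injection order
# by fitting an ordinary least-squares slope and reporting relative drift.

def drift_per_injection(concentrations):
    """Return the OLS slope of concentration vs. injection index."""
    n = len(concentrations)
    mean_x = (n - 1) / 2
    mean_y = sum(concentrations) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(range(n), concentrations))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def flag_drift(concentrations, rel_tolerance=0.02):
    """Flag the metabolite if per-injection drift exceeds rel_tolerance of the mean."""
    mean_y = sum(concentrations) / len(concentrations)
    return abs(drift_per_injection(concentrations)) / mean_y > rel_tolerance

qc = [100.0, 104.0, 109.0, 112.0, 118.0, 121.0]  # steadily rising QC signal
print(flag_drift(qc))  # a ~4% per-injection rise is flagged
```

A real QC report would of course test many metabolites at once and use a proper significance test on the slope rather than a fixed tolerance.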

Author(s):  
Mangal Shailesh Nagarsenker ◽  
Megha Sunil Marwah

The science of liposomes has expanded from bench to clinic through industrial production in the thirty years since the naissance of the concept. This chapter attempts to bring to light the seminal contributions of great researchers in the field of liposomology, which has witnessed clinical success in recent times. The journey, which began in 1965 with the observations of Bangham, and the advances made en route (targeting/stealthing of liposomes), along with alternative and potential liposome-forming amphiphiles, are highlighted in this chapter. The authors have also summarised the conventional and novel industrially feasible methods used to formulate liposomes, in addition to characterisation techniques which have been used to set up quality control standards for large-scale production. Additionally, the authors provide an overview of primary therapeutic and diagnostic applications and a brief insight into the in vivo behaviour of liposomes.


2020 ◽  
Author(s):  
Jason P. Smith ◽  
M. Ryan Corces ◽  
Jin Xu ◽  
Vincent P. Reuter ◽  
Howard Y. Chang ◽  
...  

Motivation As chromatin accessibility data from ATAC-seq experiments continue to expand, there is a continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. Results PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. Availability BSD2-licensed code and documentation at https://pepatac.databio.org.
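The benefit of serial mitochondrial alignment can be illustrated with toy arithmetic (the read counts below are invented, and this is a conceptual sketch, not PEPATAC's implementation): because ATAC-seq libraries often contain a large mitochondrial fraction, a combined alignment rate is inflated by chrM reads, while prealigning to chrM and reporting the nuclear rate on the remaining reads gives a more honest QC metric.

```python
# Illustrative arithmetic (not PEPATAC's code): how serial alignment to the
# mitochondrial genome changes the reported alignment rate of a toy library.

def alignment_rate(aligned, total):
    return aligned / total

total_reads = 1_000_000
chrm_reads = 400_000          # invented mitochondrial fraction
nuclear_aligned = 540_000     # invented nuclear-aligned count

# naive combined rate counts chrM reads as "aligned"
combined = alignment_rate(chrm_reads + nuclear_aligned, total_reads)

# serial strategy: nuclear rate on the reads left after chrM prealignment
serial = alignment_rate(nuclear_aligned, total_reads - chrm_reads)

print(f"combined: {combined:.2%}, nuclear-only: {serial:.2%}")
```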


2019 ◽  
Author(s):  
L Cao ◽  
C Clish ◽  
FB Hu ◽  
MA Martínez-González ◽  
C Razquin ◽  
...  

Abstract Motivation Large-scale untargeted metabolomics experiments lead to the detection of thousands of novel metabolic features as well as false-positive artifacts. With the incorporation of pooled QC samples and corresponding bioinformatics algorithms, those measurement artifacts can be well quality controlled. However, it is impractical for all studies to apply such an experimental design. Results We introduce a post-alignment quality control method called genuMet, which is based solely on the injection order of biological samples to identify potentially false metabolic features. In terms of the missing pattern of metabolic signals, genuMet can reach over 95% true negative rate and 85% true positive rate with suitable parameters, compared with the algorithm utilizing pooled QC samples. genuMet makes it possible for studies without pooled QC samples to reduce false metabolic signals and perform robust statistical analysis. Availability and implementation genuMet is implemented in an R package and available at https://github.com/liucaomics/genuMet under the GPL-v2 license. Contact Liming Liang: [email protected] Supplementary information Supplementary data are available at ….
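One way to build intuition for an injection-order missingness check is sketched below. This is a hypothetical heuristic in the spirit of the abstract, not genuMet's actual algorithm: genuine features often go missing in contiguous blocks along the run (e.g., sensitivity drift), whereas noise features tend to be missing at scattered, injection-independent positions, so counting detected/missing state switches separates the two patterns.

```python
# Hypothetical sketch (not genuMet's algorithm): score how "scattered" a
# feature's missingness is along injection order. Each missing block
# contributes at most 2 state switches, so scattered single misses push the
# score toward 1.0 while one contiguous missing block keeps it low.

def scatter_score(detected):
    """detected: list of booleans, one per sample, in injection order."""
    switches = sum(1 for a, b in zip(detected, detected[1:]) if a != b)
    n_missing = detected.count(False)
    if n_missing in (0, len(detected)):
        return 0.0  # nothing missing, or nothing detected: pattern uninformative
    return switches / (2 * n_missing)

blocky    = [True] * 6 + [False] * 4                      # one missing block
scattered = [True, False] * 4 + [True, True]              # alternating misses
print(scatter_score(blocky), scatter_score(scattered))    # low vs. high
```

A real method would calibrate such a score against the number of samples and overall missingness rate rather than using it raw.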


2017 ◽  
Author(s):  
Andrew Dhawan ◽  
Alessandro Barberis ◽  
Wei-Chen Cheng ◽  
Enric Domingo ◽  
Catharine West ◽  
...  

Abstract With next-generation sequencing generating large amounts of genomic data, gene expression signatures are becoming critically important tools, poised to make a large impact on the diagnosis, management and prognosis of a number of diseases. Increasingly, it is becoming necessary to determine whether a gene expression signature may apply to a dataset, but no standard quality control methodology exists. In this work, we introduce the first protocol, implemented in the R package sigQC, enabling a streamlined, methodological and standardised approach to the quality control validation of gene signatures on independent datasets. The emphasis in this work is on showing the critical quality control steps involved in the generation of a clinically and biologically useful, transportable gene signature, including ensuring sufficient expression, variability, and autocorrelation of a signature. We demonstrate the application of the protocol, showing how the outputs created by sigQC may be used for the evaluation of gene signatures on large-scale gene expression data in cancer.
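The three checks the abstract names (sufficient expression, variability, and signature coherence) can be sketched on a small expression matrix. This is an illustrative Python sketch of the kinds of metrics involved, not sigQC's API or exact definitions; the gene names, threshold, and "mean inter-gene correlation" stand-in for signature autocorrelation are assumptions.

```python
# Illustrative sketch (not sigQC's API): basic QC metrics for a candidate
# gene signature: fraction of samples expressing each gene, coefficient of
# variation across samples, and mean pairwise correlation among the genes.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def signature_qc(expr, min_expr=1.0):
    """expr: dict mapping gene name -> list of per-sample expression values."""
    genes = list(expr)
    report = {}
    for g in genes:
        vals = expr[g]
        mean = sum(vals) / len(vals)
        sd = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
        report[g] = {
            "frac_expressed": sum(v >= min_expr for v in vals) / len(vals),
            "cv": sd / mean if mean else float("inf"),
        }
    pairs = [(a, b) for i, a in enumerate(genes) for b in genes[i + 1:]]
    report["mean_intergene_corr"] = (
        sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)
    )
    return report

expr = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],   # perfectly tracks GENE_A
    "GENE_C": [5.0, 5.1, 4.9, 5.0],   # nearly flat: low variability
}
qc = signature_qc(expr)
print(qc["mean_intergene_corr"])
```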


2011 ◽  
Vol 2011 ◽  
pp. 1-15 ◽  
Author(s):  
Jeffrey C. Miecznikowski ◽  
Daniel P. Gaile ◽  
Song Liu ◽  
Lori Shepherd ◽  
Norma Nowak

The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model these data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose “SmoothArray”, a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays, and we show its effects on a cancer dataset. As part of our R software package “aCGHplus,” this freely available algorithm removes the variation due to intensity effects, pin/print-tip, spatial location on the microarray chip, and relative location from the well plate. Removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, the user can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.
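A minimal form of print-tip correction can be sketched as median-centering within each pin group. This is a simplified illustration of the general idea, not the aCGHplus/SmoothArray code (which also models intensity, spatial, and plate effects); the offsets in the toy data are invented.

```python
# Simplified sketch (not the aCGHplus implementation): remove per-pin bias
# by median-centering log2 ratios within each print-tip group, so a
# systematic offset from one arrayer pin is not mistaken for a copy-number
# change downstream.

from statistics import median

def pintip_median_center(log_ratios, pin_ids):
    """log_ratios: per-spot log2 ratios; pin_ids: print-tip group of each spot."""
    by_pin = {}
    for ratio, pin in zip(log_ratios, pin_ids):
        by_pin.setdefault(pin, []).append(ratio)
    offsets = {pin: median(vals) for pin, vals in by_pin.items()}
    return [r - offsets[p] for r, p in zip(log_ratios, pin_ids)]

ratios = [0.5, 0.6, 0.4,    # pin 1: roughly +0.5 offset
          -0.2, -0.3, -0.1] # pin 2: roughly -0.2 offset
pins = [1, 1, 1, 2, 2, 2]
print(pintip_median_center(ratios, pins))
```

Production methods typically fit a smooth curve (e.g., loess) per print-tip group rather than a single median, but the centering principle is the same.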


2020 ◽  
Author(s):  
Shirin Moossavi ◽  
Kelsey Fehr ◽  
Theo J. Moraes ◽  
Ehsan Khafipour ◽  
Meghan B. Azad

Abstract Background Quality control, including assessment of batch variabilities and confirmation of repeatability and reproducibility, is an integral component of high-throughput omics studies, including microbiome research. Batch effects can mask true biological results and/or result in irreproducible conclusions and interpretations. Low biomass samples in microbiome research are prone to reagent contamination; yet, quality control procedures for low biomass samples in large-scale microbiome studies are not well established. Results In this study we propose a framework for an in-depth, step-by-step approach to address this gap. The framework consists of three independent stages: (1) verification of sequencing accuracy by assessing technical repeatability and reproducibility of the results using mock communities and biological controls; (2) contaminant removal and batch variability correction by applying a two-tier strategy using statistical algorithms (e.g. decontam) followed by comparison of the data structure between batches; and (3) corroborating the repeatability and reproducibility of microbiome composition and downstream statistical analysis. Using this approach on the milk microbiota data from the CHILD Cohort generated in two batches (extracted and sequenced in 2016 and 2019), we were able to identify potential reagent contaminants that were missed by standard algorithms and substantially reduce contaminant-induced batch variability. Additionally, we confirmed the repeatability and reproducibility of our results in each batch before merging them for downstream analysis. Conclusion This study provides important insights to advance quality control efforts in low biomass microbiome research. Within-study quality control that takes advantage of the data structure (i.e. differential prevalence of contaminants between batches) would enhance the overall reliability and reproducibility of research in this field.


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Shirin Moossavi ◽  
Kelsey Fehr ◽  
Ehsan Khafipour ◽  
Meghan B. Azad

Abstract Background Quality control, including assessment of batch variabilities and confirmation of repeatability and reproducibility, is an integral component of high-throughput omics studies, including microbiome research. Batch effects can mask true biological results and/or result in irreproducible conclusions and interpretations. Low biomass samples in microbiome research are prone to reagent contamination; yet, quality control procedures for low biomass samples in large-scale microbiome studies are not well established. Results In this study, we have proposed a framework for an in-depth step-by-step approach to address this gap. The framework consists of three independent stages: (1) verification of sequencing accuracy by assessing technical repeatability and reproducibility of the results using mock communities and biological controls; (2) contaminant removal and batch variability correction by applying a two-tier strategy using statistical algorithms (e.g. decontam) followed by comparison of the data structure between batches; and (3) corroborating the repeatability and reproducibility of microbiome composition and downstream statistical analysis. Using this approach on the milk microbiota data from the CHILD Cohort generated in two batches (extracted and sequenced in 2016 and 2019), we were able to identify potential reagent contaminants that were missed with standard algorithms and substantially reduce contaminant-induced batch variability. Additionally, we confirmed the repeatability and reproducibility of our results in each batch before merging them for downstream analysis. Conclusion This study provides important insight to advance quality control efforts in low biomass microbiome research. Within-study quality control that takes advantage of the data structure (i.e. differential prevalence of contaminants between batches) would enhance the overall reliability and reproducibility of research in this field.
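The prevalence logic behind contaminant detection can be sketched briefly. This is a hedged illustration in the spirit of prevalence-based tools like decontam, not their actual statistics (decontam uses a chi-square/Fisher-style test rather than a bare comparison), and the toy counts are invented: a taxon detected in a larger fraction of negative extraction controls than of biological samples is a candidate reagent contaminant.

```python
# Hedged sketch (not decontam's actual test): flag a taxon as a candidate
# reagent contaminant when its detection prevalence is higher in negative
# extraction controls than in biological samples.

def prevalence(counts, threshold=0):
    """Fraction of samples in which the taxon's count exceeds threshold."""
    return sum(c > threshold for c in counts) / len(counts)

def is_candidate_contaminant(sample_counts, control_counts):
    return prevalence(control_counts) > prevalence(sample_counts)

# toy taxon: present in 5/6 negative controls but only 2/8 milk samples
samples = [0, 0, 3, 0, 0, 7, 0, 0]
controls = [12, 9, 0, 15, 8, 11]
print(is_candidate_contaminant(samples, controls))  # True
```

The paper's point about batch structure would extend this by comparing such prevalence patterns between the 2016 and 2019 batches rather than within one batch only.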


Author(s):  
Sophie Harris ◽  
Elizabeth Jenkinson ◽  
Edward Carlton ◽  
Tom Roberts ◽  
Jo Daniels

This study aimed to gain an uncensored insight into the most difficult aspects of working as a frontline doctor across successive COVID-19 pandemic waves. Data collected by the parent study (CERA) were analysed using conventional content analysis. Participants comprised frontline doctors who worked in emergency, anaesthetic, and intensive care medicine in the UK and Ireland during the COVID-19 pandemic (n = 1379). All seniority levels were represented, 42.8% of the sample were male, and 69.2% were white. Four themes were identified with nine respective categories (in parentheses): (1) I’m not a COVID hero, I’m COVID cannon fodder (exposed and unprotected, “a kick in the teeth”); (2) the relentlessness and pervasiveness of COVID (“no respite”, “shifting sands”); (3) the ugly truths of the frontline (“inhumane” care, complex team dynamics); (4) an overwhelmed system exacerbated by COVID (overstretched and under-resourced, constant changes and uncertainty, the added hindrance of infection control measures). Findings reflect the multifaceted challenges faced after successive pandemic waves; basic wellbeing needs continue to be neglected and the emotional impact is further pronounced. Steps are necessary to mitigate the repeated trauma exposure of frontline doctors as COVID-19 becomes endemic and health services attempt to recover with inevitable long-term sequelae.


2019 ◽  
Author(s):  
Melissa Y Yan ◽  
Betsy Ferguson ◽  
Benjamin N Bimber

Abstract Summary Large-scale genomic studies produce millions of sequence variants, generating datasets far too massive for manual inspection. To ensure variant and genotype data are consistent and accurate, it is necessary to evaluate variants prior to downstream analysis using quality control (QC) reports. Variant call format (VCF) files are the standard format for representing variant data; however, generating summary statistics from these files is not always straightforward. While tools to summarize variant data exist, they generally produce simple text tables, which still require additional processing and interpretation. VariantQC fills this gap as a user-friendly, interactive visual QC report that generates and concisely summarizes statistics from VCF files. The report aggregates and summarizes variants by dataset, chromosome, sample and filter type. The VariantQC report provides a high-level dataset summary, supports quality control, and helps flag outliers. Furthermore, VariantQC operates on VCF files, so it can be easily integrated into many existing variant pipelines. Availability and implementation DISCVRSeq's VariantQC tool is freely available as a Java program, with the compiled JAR and source code available from https://github.com/BimberLab/DISCVRSeq/. Documentation and example reports are available at https://bimberlab.github.io/DISCVRSeq/.
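The kind of aggregation such a report performs can be sketched directly on the VCF text format, whose data lines are tab-separated with CHROM in column 1 and FILTER in column 7. This Python sketch is illustrative only (VariantQC itself is a Java tool built on far richer statistics); the sample records are invented.

```python
# Illustrative sketch (not VariantQC itself): count variants per chromosome
# and per FILTER value from the tab-separated body of a VCF file.

from collections import Counter

def summarize_vcf(lines):
    by_chrom, by_filter = Counter(), Counter()
    for line in lines:
        if line.startswith("#"):        # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        by_chrom[fields[0]] += 1        # CHROM column
        by_filter[fields[6]] += 1       # FILTER column
    return by_chrom, by_filter

vcf_body = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t12\tLowQual\t.",
    "chr2\t150\t.\tG\tA\t60\tPASS\t.",
]
chroms, filters = summarize_vcf(vcf_body)
print(dict(chroms), dict(filters))  # {'chr1': 2, 'chr2': 1} {'PASS': 2, 'LowQual': 1}
```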


2020 ◽  
Vol 36 (9) ◽  
pp. 2856-2861
Author(s):  
Gabriel E Hoffman ◽  
Jaroslav Bendl ◽  
Kiran Girdhar ◽  
Panos Roussos

Abstract Motivation Identifying correlated epigenetic features and finding differences in correlation between individuals with disease and controls can give novel insight into disease biology. This framework has been successful in the analysis of gene expression data, but its application to epigenetic data has been limited by computational cost and a lack of scalable software and robust statistical tests. Results decorate, a differential epigenetic correlation test, identifies correlated epigenetic features and finds clusters of features that are differentially correlated between two or more subsets of the data. The software scales to genome-wide datasets of epigenetic assays on hundreds of individuals. We apply decorate to four large-scale datasets of DNA methylation, ATAC-seq and histone modification ChIP-seq. Availability and implementation The decorate R package is available from https://github.com/GabrielHoffman/decorate. Supplementary information Supplementary data are available at Bioinformatics online.
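A textbook version of a differential correlation test can be sketched with the Fisher z-transform. This is a standard approach shown for illustration, not necessarily decorate's exact statistic (the abstract does not spell it out), and the correlations and sample sizes below are invented.

```python
# Standard Fisher z test for differential correlation (illustrative; not
# necessarily the statistic decorate uses): compare the correlation of two
# epigenetic features between cases and controls.

import math

def fisher_z(r):
    """Fisher z-transform of a correlation coefficient."""
    return 0.5 * math.log((1 + r) / (1 - r))

def diff_corr_z(r1, n1, r2, n2):
    """z statistic for H0: the two population correlations are equal."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se

# two features strongly correlated in cases (r=0.8, n=100) but barely in
# controls (r=0.1, n=100): the z statistic is far into the rejection region
z = diff_corr_z(0.8, 100, 0.1, 100)
print(round(z, 2))
```

Genome-wide tools must additionally control for the huge number of feature pairs tested, which is where the scalability the abstract emphasizes comes in.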

