calling algorithm
Recently Published Documents



2021 ◽  
Author(s):  
Melivoia Rapti ◽  
Jenny Meylan Merlini ◽  
Emmanuelle Ranza ◽  
Stylianos E. Antonarakis ◽  
Federico A. Santoni

CoverageMaster (CoM) is a copy number variation (CNV) calling algorithm based on depth-of-coverage maps, designed to detect CNVs of any size in exome (WES) and genome (WGS) data. The core of the algorithm is the compression of sequencing coverage data in a multiscale wavelet space and its analysis with an iterative Hidden Markov Model (HMM). CoM processes WES and WGS data at nucleotide-scale resolution and accurately detects and visualizes CNVs across the full size range, including single or partial exon deletions and duplications. The results obtained with this approach support the possibility that coverage-based CNV callers could replace probe-based methods such as array CGH and MLPA in the near future.
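
The two ingredients named in this abstract, multiscale compression of coverage and HMM-based segmentation, can be illustrated with a minimal sketch. This is not the CoverageMaster implementation: the pairwise averaging (the Haar approximation up to a scale factor), the three states, the expected log2 ratios, and the `sigma` and `stay` parameters are all assumptions chosen for illustration.

```python
import math

STATES = ("deletion", "normal", "duplication")
# Assumed expected log2 coverage ratios for a diploid genome (illustrative only).
MEANS = {"deletion": -1.0, "normal": 0.0, "duplication": 0.585}


def haar_smooth(values):
    """One coarsening level: average adjacent pairs (Haar approximation up to scale)."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values) - 1, 2)]


def viterbi(log_ratios, sigma=0.25, stay=0.98):
    """Most likely state path for a 3-state HMM with Gaussian emissions."""
    switch = (1 - stay) / (len(STATES) - 1)

    def emit(state, x):
        # Gaussian log-likelihood, dropping the constant shared by all states.
        return -((x - MEANS[state]) ** 2) / (2 * sigma ** 2)

    def trans(a, b):
        return math.log(stay if a == b else switch)

    scores = {s: math.log(1 / len(STATES)) + emit(s, log_ratios[0]) for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[p] + trans(p, s))
            new_scores[s] = scores[prev] + trans(prev, s) + emit(s, x)
            pointers[s] = prev
        scores = new_scores
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical log2 ratios: three normal bins, three deleted bins, two normal bins.
ratios = [0.0, 0.05, -0.02, -1.0, -0.95, -1.05, 0.0, 0.03]
path = viterbi(ratios)
print(path)
```

In a multiscale setting, `haar_smooth` could be applied repeatedly to obtain coarser coverage tracks before segmentation; how CoM actually combines scales and iterates the HMM is not described in the abstract.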


2021 ◽  
Author(s):  
Sajad Moshkelgosha ◽  
Allen Duong ◽  
Gavin Wilson ◽  
Tallulah Andrews ◽  
Gregory Berra ◽  
...  

Lung transplant (LT) recipients experience episodes of immune-mediated acute lung allograft dysfunction (ALAD). ALAD episodes are a risk factor for chronic lung allograft dysfunction (CLAD), the major cause of death after LT. We have applied single-cell RNA sequencing (scRNAseq) to bronchoalveolar lavage (BAL) cells from stable and ALAD patients and to cells from explanted CLAD lung tissue to determine key cellular elements in dysfunctional lung allografts, with a focus on macrophages. We identified two alveolar macrophage (AM) subsets uniquely represented in ALAD. Using pathway analysis and differentially expressed genes, we annotated these as pro-inflammatory interferon-stimulated gene (ISG) and metallothionein-mediated inflammatory (MT) AMs. Functional analysis of an independent set of AMs in vitro revealed that ALAD AMs exhibited a higher expression of CXCL10, a marker of ISG AMs, and increased secretion of pro-inflammatory cytokines compared to AMs from stable patients. Using publicly available BAL scRNAseq datasets, we found that ISG and MT AMs are associated with more severe inflammation in COVID-19 patients. Analysis of cells from four explanted CLAD lungs revealed similar macrophage populations. Using a single nucleotide variation calling algorithm, we also demonstrate contributions of donor and recipient cells to all AM subsets early post-transplant, with loss of donor-derived cells over time. Our data reveals extensive heterogeneity among lung macrophages after LT and indicates that specific sub-populations may be associated with allograft dysfunction, raising the possibility that these cells may represent important therapeutic targets.


2021 ◽  
Author(s):  
Wan-Ping Lee ◽  
Qihui Zhu ◽  
Xiaofei Yang ◽  
Silvia Liu ◽  
Eliza Cerveira ◽  
...  

We aimed to develop a whole genome sequencing (WGS)-based copy number variant (CNV) calling algorithm with the potential to replace the chromosomal microarray assay (CMA) for clinical diagnosis. JAX-CNV was developed for CNV detection from WGS. The performance of this CNV calling algorithm was evaluated in a blinded manner on 31 samples and compared to the results of clinically validated CMAs. JAX-CNV recalled all 112 CNVs reported by the clinically validated CMAs for the 31 samples (100% recall). In addition, JAX-CNV identified an average of 30 CNVs per individual, an approximately seven-fold increase over the calls of clinically validated CMAs. Experimental validation of 24 randomly selected CNVs showed one false positive (i.e., a false discovery rate of 4.17%). A robustness test on lower-coverage data revealed 100% sensitivity for CNVs greater than 300 kb (the current threshold of the College of American Pathologists) down to 10x coverage. For CNVs greater than 50 kb, sensitivities were 100% for coverages deeper than 20x, 97% for 15x, and 95% for 10x. We developed a WGS-based CNV pipeline, including the newly developed CNV caller JAX-CNV, and found it capable of detecting CMA-reported CNVs at 100% sensitivity with a false discovery rate of about 4%. We propose that JAX-CNV be further examined in a multi-institutional study to justify the transition of first-tier genetic testing from CMAs to WGS. JAX-CNV is available at https://github.com/TheJacksonLaboratory/JAX-CNV.
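
The headline figures in this abstract follow from simple ratios. The short sketch below reproduces them from the counts reported above (1 false positive among 24 validated calls, 112 of 112 CMA-reported CNVs recalled); the function names are illustrative and not part of JAX-CNV.

```python
def false_discovery_rate(false_positives, validated_calls):
    """Fraction of validated calls that turned out to be false positives."""
    return false_positives / validated_calls


def recall(true_positives, reference_total):
    """Fraction of reference (CMA-reported) CNVs that the caller recovered."""
    return true_positives / reference_total


fdr = false_discovery_rate(1, 24)
rec = recall(112, 112)
print(f"FDR = {fdr:.2%}")     # FDR = 4.17%
print(f"Recall = {rec:.0%}")  # Recall = 100%
```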


2021 ◽  
Author(s):  
Sajad Moshkelgosha ◽  
Gavin Wilson ◽  
Allen Duong ◽  
Tallulah Andrews ◽  
Gregory Berra ◽  
...  

Abstract Purpose Lung transplant (LT) recipients experience episodes of immune-mediated acute lung allograft dysfunction (ALAD). We have applied single-cell RNA sequencing (scRNAseq) to bronchoalveolar lavage (BAL) cells of stable and ALAD patients to determine key cellular elements in dysfunctional lung allografts. Our particular focus here is on alveolar macrophages (AMs), as scRNAseq enables us to elucidate their heterogeneity and possible association with ALAD, where our knowledge from cytometry-based assays is very limited. Methods Fresh BAL cells from 6 LT patients, 3 with stable lung function (3044 ± 1519 cells) and 3 undergoing an episode of ALAD (2593 ± 904 cells), were used for scRNAseq. R Bioconductor and Seurat were used to perform QC, dimensionality reduction, annotation, pathway analysis, and trajectory analysis. Donor and recipient deconvolution was performed using single nucleotide variations. Results Our data revealed that AMs are highly heterogeneous (12 transcriptionally distinct subsets in stable patients). We identified two AM subsets uniquely represented in ALAD. Based on pathway analysis and the top differentially expressed genes in BAL, we annotated them as pro-inflammatory interferon-stimulated gene (ISG) and metallothionein-mediated inflammatory (MT) AMs. Pseudotime analysis suggested that ISG AMs represent an earlier stage of differentiation, which may indicate that they are monocyte-derived macrophages. Our functional analysis of an independent set of BAL samples shows that ALAD samples have significantly higher expression of CXCL10, a marker of ISG AMs, as well as higher secretion of pro-inflammatory cytokines. A single nucleotide variation calling algorithm allowed us to identify macrophages of donor origin and demonstrated that donor AMs are lost with time post-transplant. Conclusion Using scRNAseq, we observed AM heterogeneity and identified specific subsets that may be associated with allograft dysfunction. Further exploration with scRNAseq will shed light on LT immunobiology and the role of AMs in allograft injury and dysfunction.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Aseel Awdeh ◽  
Marcel Turcotte ◽  
Theodore J. Perkins

Abstract Background Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), initially introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias, and ChIP-seq experiments are no exception. To alleviate bias, the incorporation of control datasets in ChIP-seq analysis is an essential step. The controls are used to account for the background signal, while the remainder of the ChIP-seq signal captures true binding or histone modification. However, a recurrent issue is different types of bias in different ChIP-seq experiments. Depending on which controls are used, different aspects of ChIP-seq bias are better or worse accounted for, and peak calling can produce different results for the same ChIP-seq experiment. Consequently, generating “smart” controls, which model the non-signal effect for a specific ChIP-seq experiment, could enhance contrast and increase the reliability and reproducibility of the results. Result We propose a peak calling algorithm, Weighted Analysis of ChIP-seq (WACS), which is an extension of the well-known peak caller MACS2. There are two main steps in WACS: First, weights are estimated for each control using non-negative least squares regression. The goal is to customize controls to model the noise distribution for each ChIP-seq experiment. This is then followed by peak calling. We demonstrate that WACS significantly outperforms MACS2 and AIControl, another recent algorithm for generating smart controls, in the detection of enriched regions along the genome, in terms of motif enrichment and reproducibility analyses. Conclusions This ultimately improves our understanding of ChIP-seq controls and their biases, and shows that WACS results in a better approximation of the noise distribution in controls.
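
The first WACS step described above, estimating a non-negative weight for each control, is a non-negative least squares (NNLS) problem. The sketch below solves a toy instance with projected gradient descent in plain Python. It is not the WACS code: the solver choice, the function name `nnls_weights`, and the toy coverage vectors are assumptions made for illustration.

```python
def nnls_weights(controls, treatment, iterations=2000):
    """Minimise ||sum_j w_j * controls[j] - treatment||^2 subject to w_j >= 0.

    controls: list of per-bin read-count vectors, one per control dataset.
    treatment: per-bin read-count vector of the ChIP-seq sample.
    """
    k = len(controls)
    # Gram matrix C^T C and right-hand side C^T t.
    gram = [[sum(a * b for a, b in zip(controls[i], controls[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(a * b for a, b in zip(controls[i], treatment)) for i in range(k)]
    # The trace bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[i][i] for i in range(k))

    weights = [0.0] * k
    for _ in range(iterations):
        grad = [sum(gram[i][j] * weights[j] for j in range(k)) - rhs[i]
                for i in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, weights[i] - step * grad[i]) for i in range(k)]
    return weights


# Toy data: the "treatment" background is exactly 2.0 * control_a + 0.5 * control_b.
control_a = [1.0, 0.0, 2.0, 1.0]
control_b = [0.0, 1.0, 1.0, 3.0]
treatment = [2.0, 0.5, 4.5, 3.5]

weights = nnls_weights([control_a, control_b], treatment)
print([round(w, 4) for w in weights])
```

The weighted sum of controls would then serve as the background model passed to peak calling; how WACS integrates the weights with MACS2 is not detailed in the abstract.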


2020 ◽  
Vol 36 (12) ◽  
pp. 3625-3631
Author(s):  
Chenfu Shi ◽  
Magnus Rattray ◽  
Gisela Orozco

Abstract Motivation HiChIP is a powerful tool to interrogate 3D chromatin organization. Current tools to analyse chromatin looping mechanisms using HiChIP data require the identification of loop anchors to work properly. However, current approaches to discover these anchors from HiChIP data are not satisfactory, having either a very high false discovery rate or strong dependence on sequencing depth. Moreover, these tools do not allow quantitative comparison of peaks across different samples, failing to fully exploit the information available from HiChIP datasets. Results We develop a new tool based on a representation of HiChIP data centred on the re-ligation sites to identify peaks from HiChIP datasets, which can subsequently be used in other tools for loop discovery. This increases the reliability of these tools and improves recall rate as sequencing depth is reduced. We also provide a method to count reads mapping to peaks across samples, which can be used for differential peak analysis using HiChIP data. Availability and implementation HiChIP-Peaks is freely available at https://github.com/ChenfuShi/HiChIP_peaks. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Leilei Wu ◽  
Qinfang Deng ◽  
Ze Xu ◽  
Songwen Zhou ◽  
Chao Li ◽  
...  

Abstract Background Hybrid capture-based next-generation sequencing of DNA has been widely applied in the detection of circulating tumor DNA (ctDNA). Various methods have been proposed for ctDNA detection, but low-allelic-fraction (AF) variants are still a great challenge. In addition, no panel-wide calling algorithm is available, which hinders the full usage of ctDNA-based ‘liquid biopsy’. Thus, we developed VBCALAVD (Virtual Barcode-based Calling Algorithm for Low Allelic Variant Detection) in silico to overcome these limitations. Results Based on an understanding of the nature of ctDNA fragmentation, a novel platform-independent virtual barcode strategy was established to eliminate random sequencing errors by clustering sequencing reads into virtual families. Stereotypical mutant-family-level background artifacts were polished by constructing AF distributions. Three additional robust fine-tuning filters were obtained to eliminate stochastic mutant-family-level noise. The performance of our algorithm was validated using cell-free DNA reference standard samples (cfDNA RSDs) and normal healthy cfDNA samples (cfDNA controls). For the RSDs with AFs of 0.1%, 0.2%, 0.5%, 1% and 5%, the mean F1 scores were 0.43 (0.25~0.56), 0.77, 0.92, 0.926 (0.86~1.0) and 0.89 (0.75~1.0), respectively, which indicates that the proposed approach significantly outperforms the published algorithms. Among controls, no false positives were detected. Meanwhile, characteristics of mutant-family-level noise and quantitative determinants of divergence between mutant-family-level noise from controls and RSDs were clearly depicted. Conclusions Due to its good performance in the detection of low-AF variants, our algorithm will greatly facilitate the noninvasive panel-wide detection of ctDNA in research and clinical settings. The whole pipeline is available at https://github.com/zhaodalv/VBCALAVD.
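
The F1 scores quoted above combine precision and recall into one number. As a reminder of how such a score is derived from variant counts, here is a minimal sketch; the counts in the example are hypothetical and are not taken from the VBCALAVD evaluation.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    called = true_positives + false_positives
    expected = true_positives + false_negatives
    if called == 0 or expected == 0:
        return 0.0
    precision = true_positives / called
    recall = true_positives / expected
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 9 known variants detected, 1 spurious call, 2 variants missed.
score = f1_score(true_positives=9, false_positives=1, false_negatives=2)
print(round(score, 3))  # 0.857
```

Because F1 penalises both missed variants and spurious calls, it is a natural summary for low-AF detection, where lowering thresholds to gain sensitivity tends to raise the false positive count.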


2020 ◽  
Vol 14 ◽  
pp. 117793222093806
Author(s):  
Venkat S. Malladi ◽  
Anusha Nagari ◽  
Hector L Franco ◽  
W Lee Kraus

The differentiation of embryonic stem cells into various lineages is highly dependent on the chromatin state of the genome and patterns of gene expression. To identify lineage-specific enhancers driving the differentiation of progenitors into pancreatic cells, we used a previously described computational framework called Total Functional Score of Enhancer Elements (TFSEE), which integrates multiple genomic assays that probe both transcriptional and epigenomic states. First, we evaluated and compared TFSEE as an enhancer-calling algorithm with enhancers called using GRO-seq-defined enhancer transcripts (method 1) versus enhancers called using histone modification ChIP-seq data (method 2). Second, we used TFSEE to define the enhancer landscape and identify transcription factors (TFs) that maintain the multipotency of a subpopulation of endodermal stem cells during differentiation into pancreatic lineages. Collectively, our results demonstrate that TFSEE is a robust enhancer-calling algorithm that can be used to perform multilayer genomic data integration to uncover cell type-specific TFs that control lineage-specific enhancers.

