Crambled: A Shiny application to enable intuitive resolution of conflicting cellularity estimates

F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 1407 ◽  
Author(s):  
Andy G. Lynch

It is now commonplace to investigate tumour samples using whole-genome sequencing, and some commonly performed tasks are the estimation of cellularity (or sample purity), the genome-wide profiling of copy numbers, and the assessment of sub-clonal behaviours. Several tools are available to undertake these tasks, but they often give conflicting results – not least because there is often genuine uncertainty due to a lack of model identifiability. Presented here is a tool, "Crambled", that allows for an intuitive visual comparison of the conflicting solutions. Crambled is implemented as a Shiny application within R, and is accompanied by example images from two use cases (one tumour sample with matched normal sequencing, and one standalone cell line example) as well as functions to generate the necessary images from any sequencing data set. Through the use of Crambled, a user may gain insight into why each tool has offered its given solution and, combined with knowledge of the disease being studied, can choose between the competing solutions in an informed manner.

2015 ◽  
Author(s):  
Laura T Jiménez-Barrón ◽  
Jason A O'Rawe ◽  
Yiyang Wu ◽  
Margaret Yoon ◽  
Han Fang ◽  
...  

Autism spectrum disorders (ASD) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASD, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families, each consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous CNVs, a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. We are moving to implement our framework for the analysis and study of larger cohorts of families, where statistical rigor can accompany genetic findings.


2019 ◽  
Vol 57 (6) ◽  
Author(s):  
R. C. Jones ◽  
L. G. Harris ◽  
S. Morgan ◽  
M. C. Ruddy ◽  
M. Perry ◽  
...  

An inability to standardize the bioinformatic data produced by whole-genome sequencing (WGS) has been a barrier to its widespread use in tuberculosis phylogenetics. The aim of this study was to carry out a phylogenetic analysis of tuberculosis in Wales, United Kingdom, using Ridom SeqSphere software for core genome multilocus sequence typing (cgMLST) analysis of whole-genome sequencing data. The phylogenetics of tuberculosis in Wales have not previously been studied. Sixty-six Mycobacterium tuberculosis isolates (including 42 outbreak-associated isolates) from south Wales were sequenced using an Illumina platform. Isolates were assigned to principal genetic groups, single nucleotide polymorphism (SNP) cluster groups, lineages, and sublineages using SNP-calling protocols. WGS data were submitted to the Ridom SeqSphere software for cgMLST analysis and analyzed alongside 179 previously lineage-defined isolates. The data set was dominated by the Euro-American lineage, with the sublineage composition being dominated by T, X, and Haarlem family strains. The cgMLST analysis successfully assigned 58 isolates to major lineages, and the results were consistent with those obtained by traditional SNP mapping methods. In addition, the cgMLST scheme was used to resolve an outbreak of tuberculosis occurring in the region. This study supports the use of a cgMLST method for standardized phylogenetic assignment of tuberculosis isolates and for outbreak resolution and provides the first insight into Welsh tuberculosis phylogenetics, identifying the presence of the Haarlem sublineage commonly associated with virulent traits.


PLoS ONE ◽  
2019 ◽  
Vol 14 (12) ◽  
pp. e0225848
Author(s):  
Jérôme Ambroise ◽  
Léonid M. Irenge ◽  
Jean-François Durant ◽  
Bertrand Bearzatto ◽  
Godfrey Bwire ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Nemat Hedayat-Evrigh ◽  
Reza Khalkhali-Evrigh ◽  
Mohammad Reza Bakhtiarizadeh

The population size of Bactrian camels is smaller than that of dromedaries; they are distributed in cold and mountainous regions and are at risk of extinction in some countries, such as Iran. To identify and investigate genome-wide variations, whole-genome sequencing of two Iranian Bactrian camels was performed for the first time, with 37.4- and 42.6-fold coverage. Along with the Iranian Bactrian camels, sequencing data from two Mongolian domestic and two wild Bactrian camels deposited in the NCBI were reanalyzed. The analysis identified 4,908,998, 4,485,725, and 4,706,654 SNPs for Iranian, Mongolian domestic, and wild Bactrian camels, respectively. INDEL variations ranged from 358,311 to 533,188 across all six camels. Variant annotation in all samples revealed that more than 88 percent of SNPs and INDELs were located in intergenic and intronic regions. We found that 800,530 SNPs were common among all studied camels, containing 4,046 missense variants that affected 2,428 genes. Investigation of the genes common to all camels that contained missense SNPs showed that this set includes 98 zinc finger genes and 4 fertility-related genes (ZP1, ZP2, ZP4, and ZPBP).


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Robert P. Adelson ◽  
Alan E. Renton ◽  
Wentian Li ◽  
Nir Barzilai ◽  
Gil Atzmon ◽  
...  

The success of next-generation sequencing depends on the accuracy of variant calls. Few objective protocols exist for QC following variant calling from whole genome sequencing (WGS) data. After applying QC filtering based on Genome Analysis Toolkit (GATK) best practices, we used genotype discordance of eight samples that were sequenced twice each to evaluate the proportion of potentially inaccurate variant calls. We designed a QC pipeline involving hard filters to improve replicate genotype concordance, which indicates improved accuracy of genotype calls. Our pipeline analyzes the efficacy of each filtering step. We initially applied this strategy to well-characterized variants from the ClinVar database, and subsequently to the full WGS dataset. The genome-wide biallelic pipeline removed 82.11% of discordant and 14.89% of concordant genotypes, and improved the concordance rate from 98.53% to 99.69%. The variant-level read depth filter most improved the genome-wide biallelic concordance rate. We also adapted this pipeline for triallelic sites, given the increasing proportion of multiallelic sites as sample sizes increase. For triallelic sites containing only SNVs, the concordance rate improved from 97.68% to 99.80%. Our QC pipeline removes many potentially false positive calls that pass in GATK, and may inform future WGS studies prior to variant effect analysis.
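The biallelic figures reported above can be checked against one another with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the function name and its parameters are hypothetical, and it assumes the concordance rate is simply the number of concordant genotypes divided by all compared genotypes.

```python
def concordance_after_filtering(initial_rate, discordant_removed, concordant_removed):
    """Return the concordance rate left after filtering replicate genotypes.

    Hypothetical helper, not taken from the paper. All arguments are
    fractions in [0, 1]:
      initial_rate        concordance rate before filtering
      discordant_removed  share of discordant genotypes the filters remove
      concordant_removed  share of concordant genotypes the filters remove
    """
    # Work with fractions of the original genotype set.
    concordant_kept = initial_rate * (1.0 - concordant_removed)
    discordant_kept = (1.0 - initial_rate) * (1.0 - discordant_removed)
    # Assumed definition: concordant genotypes over all remaining genotypes.
    return concordant_kept / (concordant_kept + discordant_kept)


# Figures quoted in the abstract for the genome-wide biallelic pipeline.
rate = concordance_after_filtering(
    initial_rate=0.9853,
    discordant_removed=0.8211,
    concordant_removed=0.1489,
)
print(f"{rate:.2%}")
```

Under that assumed definition, the result rounds to the 99.69% stated in the abstract, so the removal percentages and the before and after rates are mutually consistent.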

