complete genomics
Recently Published Documents


Total documents: 22 (five years: 1)

H-index: 4 (five years: 0)

Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 973
Author(s):  
Enrique Canessa

A signal analysis of the complete genomes sequenced for coronavirus variants of concern, B.1.1.7 (Alpha), B.1.351 (Beta) and P.1 (Gamma), and coronavirus variants of interest, B.1.429–B.1.427 (Epsilon) and B.1.525 (Eta), is presented using open GISAID data. We deal with a new type of finite alternating sum series having independently distributed terms associated with binary (0,1) indicators for the nucleotide bases. Our method provides additional information to conventional similarity comparisons via alignment methods and Fourier Power Spectrum approaches. It uncovers distinctive patterns in the intrinsic data organization of complete genome sequences according to their progression along the nucleotide base positions. The new method could be useful for the bioinformatics surveillance and dynamics of coronavirus genome variants.
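The abstract does not give the exact form of the series, so the following is only a minimal sketch of one plausible reading: for each base, build a binary (0,1) indicator along the sequence and accumulate it with alternating signs. The function names, the sign convention (starting with +1 at the first position), and the toy sequence are assumptions made for illustration, not the author's definitions.

```python
from itertools import accumulate

BASES = "ACGT"


def indicator(sequence, base):
    """Binary (0,1) indicator: 1 where `sequence` has `base`, else 0."""
    return [1 if symbol == base else 0 for symbol in sequence.upper()]


def alternating_partial_sums(sequence, base):
    """Running sums of (-1)**j * indicator_j, with j counted from 0.

    The sign convention is an assumption; the abstract only says the
    series is a finite alternating sum over binary indicators.
    """
    signed = [(-1) ** j * x for j, x in enumerate(indicator(sequence, base))]
    return list(accumulate(signed))


if __name__ == "__main__":
    toy = "ACGTAACA"  # made-up fragment, not GISAID data
    for base in BASES:
        print(base, alternating_partial_sums(toy, base))
```

Plotting each running sum against position would give one curve per base, which is the kind of position-dependent pattern the abstract refers to.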



2020 ◽  
Author(s):  
Susanne Gerber ◽  
Stephan Weißbach ◽  
Stanislav Jur'evic Sys ◽  
Charlotte Hewel ◽  
Hristo Todorov ◽  
...  

Abstract Background: Next Generation Sequencing (NGS) is the foundation of various studies providing insights into questions from biology and medicine. Nevertheless, integrating data from different experimental backgrounds can introduce strong biases. In order to methodically investigate the magnitude of systematic errors, we performed a cross-sectional observational study on a genomic cohort of 99 subjects each sequenced via (i) Illumina HiSeq X, (ii) Illumina HiSeq and (iii) Complete Genomics. Consequently, we systematically analyzed the heterogeneity between the sequencing cohorts with respect to genomic annotation and common filter criteria like minor allele frequency (MAF). Results: The number of detected variants/variant classes per individual was highly dependent on the sequencing technology. We observed a statistically significant overrepresentation of variants uniquely called by a single platform, which indicates potential systematic biases. These variants were enriched in low complexity genomic regions and simple repeats. Furthermore, estimates of allele frequency were highly discrepant for a subset of variants in pairwise comparisons between different sequencing platforms. Applying common filters, such as MAF 5% and HWE, greatly reduced the heterogeneity between cohorts but still left discrepancies of several thousand variants after filtering. Conclusion: We provide empirical evidence of systematic heterogeneity in variant calls between alternative experimental and data analysis setups. Our results highlight the potential benefit of reprocessing genomic data with harmonized pipelines when integrating data from different studies.
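The two filters named in the abstract (MAF 5% and HWE) can be illustrated with a minimal sketch on per-variant genotype counts. The function names are hypothetical, and the HWE cutoff (a chi-square statistic of 3.841, the 5% critical value at one degree of freedom) is an assumption, since the abstract gives no HWE threshold. The sketch also assumes biallelic variants with at least one genotyped sample.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of the rarer allele, from biallelic genotype counts."""
    n = n_ref_hom + n_het + n_alt_hom
    alt_freq = (2 * n_alt_hom + n_het) / (2 * n)
    return min(alt_freq, 1.0 - alt_freq)


def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square statistic of observed vs. Hardy-Weinberg expected counts."""
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_ref_hom, n_het, n_alt_hom)
    # Skip classes with zero expectation to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_filters(counts, maf_min=0.05, chi2_max=3.841):
    """True if a variant meets the MAF minimum and is not far from HWE.

    `chi2_max` is an assumed cutoff, not one stated in the abstract.
    """
    return (minor_allele_frequency(*counts) >= maf_min
            and hwe_chi_square(*counts) <= chi2_max)
```

For example, genotype counts of (81, 18, 1) sit exactly at Hardy-Weinberg proportions with a MAF of 0.1 and pass, while (98, 2, 0) fails the MAF minimum and (50, 0, 50) fails the HWE check.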



2018 ◽  
Vol 64 (4) ◽  
pp. 715-725 ◽  
Author(s):  
Qing Mao ◽  
Robert Chin ◽  
Weiwei Xie ◽  
Yuqing Deng ◽  
Wenwei Zhang ◽  
...  

Abstract BACKGROUND Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. METHODS cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. RESULTS Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 (CHD8) and LDL receptor-related protein 1 (LRP1), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. CONCLUSIONS We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures.



Blood ◽  
2017 ◽  
Vol 130 (Suppl_1) ◽  
pp. SCI-5-SCI-5
Author(s):  
Elaine Lyon

Abstract The rapid evolution of genomic testing gives new meaning to the term "high-complexity testing." Variant classification is clearly challenging, but with the aid of American College of Medical Genetics and Genomics (ACMG) and Association for Molecular Pathology (AMP) professional guidelines for combining evidence used in conjunction with national efforts, laboratories may be better able to standardize and establish quality metrics. Additionally, guidelines for evaluating types of evidence and interpreting sequence variations have been developed. This session will review these guidelines to address expectations of data quality, as well as elements of reporting identified variants and their interpretations. Efforts by the molecular community to address consistency in using these guidelines will also be shown. The objectives of this presentation are to list technical guidelines for genomic sequencing to ensure quality data, to describe evidence used to classify variants, and to identify variability in how evidence is used. Disclosures Lyon: Complete Genomics: Consultancy; Korean Society for Medical and Molecular Genetics: Honoraria; Association for Molecular Pathology: Honoraria; American College of Medical Genetics and Genomics: Membership on an entity's Board of Directors or advisory committees.



2017 ◽  
Author(s):  
Todd Lencz ◽  
Jin Yu ◽  
Cameron Palmer ◽  
Shai Carmi ◽  
Danny Ben-Avraham ◽  
...  

Abstract Background: While increasingly large reference panels for genome-wide imputation have been recently made available, the degree to which imputation accuracy can be enhanced by population-specific reference panels remains an open question. In the present study, we sequenced at full-depth (≥30x) a moderately large (n=738) cohort of samples drawn from the Ashkenazi Jewish population across two platforms (Illumina X Ten and Complete Genomics, Inc.). We developed and refined a series of quality control steps to optimize sensitivity, specificity, and comprehensiveness of variant calls in the reference panel, and then tested the accuracy of imputation against target cohorts drawn from the same population. Results: For samples sequenced on the Illumina X Ten platform, quality thresholds were identified that permitted highly accurate calling of single nucleotide variants across 94% of the genome. The Complete Genomics, Inc. platform was more conservative (fewer variants called) compared to the Illumina platform, but also demonstrated relatively greater numbers of false positives that needed to be filtered. Quality control procedures also permitted detection of novel genome reads that are not mapped to current reference or alternate assemblies. After stringent quality control, the population-specific reference panel produced more accurate and comprehensive imputation results relative to publicly available, large cosmopolitan reference panels. The population-specific reference panel also permitted enhanced filtering of clinically irrelevant variants from personal genomes. Conclusions: Our primary results demonstrate enhanced accuracy of a population-specific imputation panel relative to cosmopolitan panels, especially in the range of infrequent (<5% non-reference allele frequency) and rare (<1% non-reference allele frequency) variants that may be most critical to further progress in mapping of complex phenotypes.



2017 ◽  
Author(s):  
James P. Evans ◽  
Rajesh Patidar ◽  
Zalman Vaksman ◽  
Sivasish Sindiri ◽  
Douglas R. Stewart ◽  
...  


2017 ◽  
Author(s):  
Daniel K. Putnam ◽  
Ma Xiaotu ◽  
Stephen V. Rice ◽  
Yu Liu ◽  
Jinghui Zhang ◽  
...  

Abstract VCF2CNA is a web interface tool for copy-number alteration (CNA) analysis of VCF and other variant file formats. We applied it to 46 adult glioblastoma and 146 pediatric neuroblastoma samples sequenced by Illumina and Complete Genomics (CGI) platforms, respectively. VCF2CNA was highly consistent with a state-of-the-art algorithm using raw sequencing data (mean F1-score=0.994) in high-quality glioblastoma samples and was robust to uneven coverage introduced by library artifacts. In the neuroblastoma set, VCF2CNA identified MYCN high-level amplifications in 31 of 32 clinically validated samples compared to 15 found by CGI’s HMM-based CNA model. The findings suggest that VCF2CNA is an accurate, efficient and platform-independent tool for CNA analyses without accessing raw sequence data.
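The abstract reports consistency as a mean F1-score but does not say which unit of comparison was used. As a minimal sketch, assuming the calls from each method can be reduced to sets of comparable items (for example, genomic bins flagged as altered), the F1-score is the harmonic mean of precision and recall. The function name and the set-based representation are assumptions for illustration.

```python
def f1_score(called, reference):
    """F1 of `called` against `reference`, both given as sets of items.

    Treats `reference` as the comparison standard; returns 0.0 when
    there is no overlap at all.
    """
    true_pos = len(called & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(called)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)
```

With `called = {1, 2, 3, 4}` and `reference = {2, 3, 4, 5}`, precision and recall are both 0.75, so the F1-score is also 0.75.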



2016 ◽  
Author(s):  
Dean Bobo ◽  
Mikhail Lipatov ◽  
Juan L. Rodriguez-Flores ◽  
Adam Auton ◽  
Brenna M. Henn

Abstract Short-read, next-generation sequencing (NGS) is now broadly used to identify rare or de novo mutations in population samples and disease cohorts. However, NGS data is known to be error-prone and post-processing pipelines have primarily focused on the removal of spurious mutations or “false positives” for downstream genome datasets. Less attention has been paid to characterizing the fraction of missing mutations or “false negatives” (FN). Here we interrogate several publicly available human NGS autosomal variant datasets using corresponding Sanger sequencing as a truth-set. We examine both low-coverage Illumina and high-coverage Complete Genomics genomes. We show that the FN rate varies between 3% and 18% and that false-positive rates are considerably lower (<3%) for publicly available human genome callsets like 1000 Genomes. The FN rate is strongly dependent on calling pipeline parameters, as well as read coverage. Our results demonstrate that missing mutations are a significant feature of genomic datasets and imply additional fine-tuning of bioinformatics pipelines is needed. To address this, we design a phylogeny-aware tool [PhyloFaN] which can be used to quantify the FN rate for haploid genomic experiments, without additional generation of validation data. Using PhyloFaN on ultra-high coverage NGS data from both Illumina HiSeq and Complete Genomics platforms derived from the 1000 Genomes Project, we characterize the false negative rate in human mtDNA genomes. The false negative rate for the publicly available mtDNA callsets is 17-20%, even for extremely high coverage haploid data.
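A minimal sketch of how false-negative and false-positive rates could be computed against a truth set, assuming variants are represented as hashable keys such as (chromosome, position, alternate allele). The definitions used here (missed truth variants over all truth variants, and unsupported calls over all calls) are one common convention and may differ from the paper's; the function name and toy variants are hypothetical.

```python
def error_rates(calls, truth):
    """Return (false_negative_rate, false_positive_rate) for a callset.

    FN rate: share of truth-set variants absent from `calls`.
    FP rate: share of called variants absent from `truth`.
    Both definitions are assumptions, not taken from the abstract.
    """
    missed = truth - calls
    spurious = calls - truth
    fn_rate = len(missed) / len(truth) if truth else 0.0
    fp_rate = len(spurious) / len(calls) if calls else 0.0
    return fn_rate, fp_rate


if __name__ == "__main__":
    # Toy variants, invented for illustration only.
    truth = {("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G"),
             ("chr2", 75, "C"), ("chr3", 10, "A")}
    calls = {("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G"),
             ("chr2", 75, "C"), ("chr3", 99, "T")}
    print(error_rates(calls, truth))
```

In the toy data one of five truth variants is missed and one of five calls is unsupported, so both rates come out to 0.2.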



GigaScience ◽  
2016 ◽  
Vol 5 (1) ◽  
Author(s):  
Serban Ciotlos ◽  
Qing Mao ◽  
Rebecca Yu Zhang ◽  
Zhenyu Li ◽  
Robert Chin ◽  
...  

