A comparison of BeadChip and WGS genotyping outputs using partial validation by Sanger sequencing

BMC Genomics ◽  
2020 ◽  
Vol 21 (S7) ◽  
Author(s):  
Kirill A. Danilov ◽  
Dimitri A. Nikogosov ◽  
Sergey V. Musienko ◽  
Ancha V. Baranova

Background: Head-to-head comparison of BeadChip and WGS/WES genotyping techniques for their precision is far from straightforward. A tool for validation of high-throughput genotyping calls such as Sanger sequencing is neither scalable nor practical for large-scale DNA processing. Here we report a cross-validation analysis of genotyping calls obtained via Illumina GSA BeadChip and WGS (Illumina HiSeq X Ten) techniques. Results: When compared to each other, the average precision and accuracy of BeadChip and WGS genotyping techniques exceeded 0.991 and 0.997, respectively. The average fraction of discordant variants for both platforms was found to be 0.639%. A sliding window approach was used to identify genomic regions not exceeding 500 bp that encompass a maximal number of discordant variants for further validation by Sanger sequencing. Notably, 12 variants out of 26 located within eight identified regions were consistently discordant in related calls made by WGS and BeadChip. When Sanger sequenced, a total of 16 of these genotypes were successfully resolved, indicating that the precision of WGS and BeadChip genotyping for this genotype subset was 0.81 and 0.5, respectively, with accuracy values of 0.87 and 0.61. Conclusions: We conclude that WGS genotype calling exhibits higher overall precision within the selected set of discordantly genotyped variants, though the number of validated variants remained insufficient.
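The sliding window step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `densest_window`, the inclusive-coordinate span definition, and the single-chromosome input are all assumptions made for illustration. Given discordant variant positions, it finds the window of at most 500 bp that contains the most of them.

```python
from typing import List, Tuple


def densest_window(positions: List[int], max_span: int = 500) -> Tuple[int, int, int]:
    """Return (start, end, count) for the window of at most max_span bp
    that holds the most variant positions on a single chromosome.

    Assumption: the span is counted inclusively, i.e. end - start + 1.
    """
    if not positions:
        raise ValueError("positions must not be empty")
    if max_span < 1:
        raise ValueError("max_span must be at least 1")

    pos = sorted(positions)
    best = (pos[0], pos[0], 1)
    left = 0
    for right in range(len(pos)):
        # Advance the left edge until the window fits within max_span bp.
        while pos[right] - pos[left] + 1 > max_span:
            left += 1
        count = right - left + 1
        # Keep the first window found with the highest count.
        if count > best[2]:
            best = (pos[left], pos[right], count)
    return best


if __name__ == "__main__":
    # Hypothetical discordant variant positions, for illustration only.
    example = [100, 250, 590, 600, 10_000, 10_100, 10_499]
    print(densest_window(example))
```

Because the positions are sorted once and each pointer only moves forward, the scan is linear after sorting. Ties are resolved in favour of the leftmost window, which is an arbitrary choice.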

2020 ◽  
Vol 5 (4) ◽  
pp. 416-421
Author(s):  
Fang Fang ◽  
Zhe Xu ◽  
Yue Suo ◽  
Hui Wang ◽  
Si Cheng ◽  
...  

Background: Mendelian stroke causes nearly 7% of ischaemic strokes and is also an important aetiology of cryptogenic stroke. Identifying the genetic abnormalities in Mendelian strokes is important as it would facilitate therapeutic management and genetic counselling. Next-generation sequencing makes large-scale sequencing and genetic testing possible. Methods: A systematic literature search was conducted to identify causal genes of Mendelian strokes, which were used to construct a hybridization-based gene capture panel. Genetic variants for target genes were detected using the Illumina HiSeq X10 and NovaSeq platforms. The sensitivity and specificity were evaluated by comparing the results with Sanger sequencing. Results: 53 patients with suspected Mendelian stroke were analysed using the panel of 181 causal genes. According to the American College of Medical Genetics and Genomics standard, 16 genetic variants classified as likely pathogenic or of uncertain significance were identified. Diagnostic testing was conducted by comparing the consistency between the results of the panel and Sanger sequencing. Both the sensitivity and specificity were 100% for the panel. Conclusion: This panel provides an economical, time-saving and labour-saving method to detect causal mutations of Mendelian strokes.


2021 ◽  
Vol 11 ◽  
Author(s):  
Stephanie van Wyk ◽  
Brenda D. Wingfield ◽  
Lieschen De Vos ◽  
Nicolaas A. van der Merwe ◽  
Emma T. Steenkamp

The Repeat-Induced Point (RIP) mutation pathway is a fungus-specific genome defense mechanism that mitigates the deleterious consequences of repeated genomic regions and transposable elements (TEs). RIP mutates targeted sequences by introducing cytosine to thymine transitions. We investigated the genome-wide occurrence and extent of RIP with a sliding-window approach. Using genome-wide RIP data and two sets of control groups, the associations among RIP, TEs, and GC content were contrasted in organisms capable and incapable of RIP. Based on these data, we then set out to determine the extent and occurrence of RIP in 58 representatives of the Ascomycota. The findings were summarized by placing each of the fungi investigated in one of six categories based on the extent of genome-wide RIP. In silico RIP analyses, using a sliding-window approach with stringent RIP parameters, implemented simultaneously within the same genetic context, on high quality genome assemblies, yielded superior results in determining the genome-wide RIP among the Ascomycota. Most Ascomycota had RIP and these mutations were particularly widespread among classes of the Pezizomycotina, including the early diverging Orbiliomycetes and the Pezizomycetes. The most extreme cases of RIP were limited to representatives of the Dothideomycetes and Sordariomycetes. By contrast, the genomes of the Taphrinomycotina and Saccharomycotina contained no detectable evidence of RIP. Also, recent losses in RIP combined with controlled TE proliferation in the Pezizomycotina subphyla may promote substantial genome enlargement as well as the formation of sub-genomic compartments. These findings have broadened our understanding of the taxonomic range and extent of RIP in Ascomycota and how this pathway affects the genomes of fungi harboring it.


2018 ◽  
Author(s):  
Alexander Wu ◽  
Tengfei Xiao ◽  
Teng Fei ◽  
X. Shirley Liu ◽  
Wei Li

CRISPR/Cas9 knockout screens have been widely used to interrogate gene functions across a wide range of cell systems. However, the screening outcome is biased in amplified genomic regions, due to the ability of the Cas9 nuclease to induce multiple double-stranded breaks and strong DNA damage responses at these regions. We developed algorithms to correct biases associated with copy number variations (CNV), even when the CNV profiles are unknown. We demonstrated that our methods effectively reduced false positives in amplified regions while preserving signals of true positives. In addition, we developed a sliding window approach to estimate regions of high copy numbers for cases in which CNV information is not available. These copy number estimations can subsequently be used to effectively correct CNV-related biases in CRISPR screening experiments. Our approach is integrated into the existing MAGeCK/MAGeCK-VISPR analysis pipelines and provides a convenient framework to improve the precision of CRISPR screening results.


Author(s):  
Jianglin Feng ◽  
Nathan C Sheffield

Summary: Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions. Availability: https://github.com/databio/IGD
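To give a feel for what a binning-based interval search looks like, here is a generic sketch of fixed-width binning. It is not IGD's data structure or API: the class `BinnedIndex`, its methods, the default bin size, and the half-open coordinate convention are all hypothetical choices for illustration. Each interval is stored in every bin it touches, so a query only inspects the bins that the query range overlaps.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumed convention


class BinnedIndex:
    """Toy fixed-width binning index (hypothetical, not the IGD format)."""

    def __init__(self, bin_size: int = 16384) -> None:
        self.bin_size = bin_size
        self.bins: Dict[Tuple[str, int], List[Interval]] = defaultdict(list)

    def _bin_range(self, start: int, end: int) -> range:
        if start >= end:
            raise ValueError("interval must satisfy start < end")
        # Bins touched by the half-open interval [start, end).
        return range(start // self.bin_size, (end - 1) // self.bin_size + 1)

    def add(self, chrom: str, start: int, end: int) -> None:
        # Store the interval in every bin it overlaps.
        for b in self._bin_range(start, end):
            self.bins[(chrom, b)].append((start, end))

    def query(self, chrom: str, start: int, end: int) -> Set[Interval]:
        # A set removes the duplicates that arise when an interval spans
        # several bins; identical stored intervals also collapse to one hit.
        hits: Set[Interval] = set()
        for b in self._bin_range(start, end):
            for s, e in self.bins.get((chrom, b), ()):
                if s < end and start < e:
                    hits.add((s, e))
        return hits


if __name__ == "__main__":
    index = BinnedIndex(bin_size=100)
    index.add("chr1", 50, 250)
    index.add("chr1", 300, 400)
    index.add("chr2", 50, 250)
    print(sorted(index.query("chr1", 240, 310)))
```

The trade-off in this kind of scheme is that smaller bins mean fewer candidates per query but more duplicated storage for long intervals.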


2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Martin Johnsson ◽  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
Gregor Gorjanc ◽  
Ching-Yi Chen ◽  
...  

Background: Meiotic recombination results in the exchange of genetic material between homologous chromosomes. Recombination rate varies between different parts of the genome, between individuals, and is influenced by genetics. In this paper, we assessed the genetic variation in recombination rate along the genome and between individuals in the pig using multilocus iterative peeling on 150,000 individuals across nine genotyped pedigrees. We used these data to estimate the heritability of recombination and perform a genome-wide association study of recombination in the pig. Results: Our results confirmed known features of the recombination landscape of the pig genome, including differences in genetic length of chromosomes and marked sex differences. The recombination landscape was repeatable between lines, but at the same time, there were differences in average autosome-wide recombination rate between lines. The heritability of autosome-wide recombination rate was low but not zero (on average 0.07 for females and 0.05 for males). We found six genomic regions that are associated with recombination rate, among which five harbour known candidate genes involved in recombination: RNF212, SHOC1, SYCP2, MSH4 and HFM1. Conclusions: Our results on the variation in recombination rate in the pig genome agree with those reported for other vertebrates, with a low but nonzero heritability, and the identification of a major quantitative trait locus for recombination rate that is homologous to that detected in several other species. This work also highlights the utility of using large-scale livestock data to understand biological processes.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Abhishek Uday Patil ◽  
Sejal Ghate ◽  
Deepa Madathil ◽  
Ovid J. L. Tzeng ◽  
Hsu-Wen Huang ◽  
...  

Creative cognition is recognized to involve the integration of multiple spontaneous cognitive processes and is manifested as complex networks within and between the distributed brain regions. We propose that the processing of creative cognition involves the static and dynamic re-configuration of brain networks associated with complex cognitive processes. We applied the sliding-window approach, followed by a community detection algorithm and novel measures of network flexibility, to the blood-oxygen-level-dependent (BOLD) signal of 8 major functional brain networks to reveal static and dynamic alterations in the network reconfiguration during creative cognition using functional magnetic resonance imaging (fMRI). Our results demonstrate the temporal connectivity of the dynamic large-scale creative networks between default mode network (DMN), salience network, and cerebellar network during creative cognition, and advance our understanding of the network neuroscience of creative cognition.
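As a rough illustration of the sliding-window idea in this abstract, the sketch below computes a Pearson correlation between two signals inside successive windows, which is one common way to obtain a time-varying connectivity estimate. It is not the authors' pipeline: the function names, window length, and step are hypothetical, and the community detection and flexibility measures are not covered.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (nan if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def sliding_correlation(
    x: Sequence[float], y: Sequence[float], window: int, step: int = 1
) -> List[float]:
    """Correlate x and y within each window of `window` samples,
    moving the window forward by `step` samples each time."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    if window < 2 or step < 1:
        raise ValueError("window must be >= 2 and step must be >= 1")
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(0, len(x) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy signals, not real BOLD data.
    a = [1, 2, 3, 4, 3, 2, 1]
    b = [2, 4, 6, 8, 9, 10, 11]
    print(sliding_correlation(a, b, window=3, step=2))
```

Applying this to every pair of regions yields one correlation matrix per window, which is the kind of input a community detection step would then operate on.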


2006 ◽  
Vol 04 (03) ◽  
pp. 639-647 ◽  
Author(s):  
Eleazar Eskin ◽  
Roded Sharan ◽  
Eran Halperin

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical distance between neighboring single nucleotide polymorphisms. The approach is very efficient; it has been applied to several large-scale datasets and shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .


Genes ◽  
2018 ◽  
Vol 9 (10) ◽  
pp. 500
Author(s):  
Juan A. Subirana ◽  
Xavier Messeguer

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.

