RelocaTE2: a high resolution transposable element polymorphism mapping tool for population resequencing

Author(s):  
Jinfeng Chen ◽  
Travis Wrightsman ◽  
Susan R Wessler ◽  
Jason E. Stajich

Background Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and in generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive deep sequencing technology has made it affordable to apply population genetic methods to whole genomes, using methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE transposition events or polymorphisms can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods We have developed the tool RelocaTE2 ( http://github.com/stajichlab/RelocaTE2 ) for identification of TE polymorphisms at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects the target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a higher level of sensitivity and specificity when compared to other tools. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate of less than 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE polymorphisms and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.
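The link between TSD detection and single base pair precision can be illustrated with a minimal Python sketch. It is not RelocaTE2's implementation; the function name `infer_tsd`, its parameters, and the toy reference sequence are hypothetical. It assumes that the genomic flank of reads spanning the left TE junction ends at the last base of the target site, and that the flank of reads spanning the right junction starts at its first base, so the overlap of the two flanks on the reference is the duplicated target site.

```python
from typing import Optional, Tuple


def infer_tsd(reference: str,
              left_flank_end: int,
              right_flank_start: int) -> Optional[Tuple[int, int, str]]:
    """Infer a target site duplication from two junction coordinates.

    Coordinates are 1-based and inclusive on the reference (an assumption
    of this sketch). left_flank_end is where the genomic part of
    left-junction reads stops; right_flank_start is where the genomic part
    of right-junction reads begins. If the flanks overlap, the overlapping
    bases are reported as the TSD.
    """
    if right_flank_start > left_flank_end:
        # The flanks do not overlap, so no TSD can be inferred.
        return None
    # Convert the 1-based inclusive interval to a Python slice.
    tsd = reference[right_flank_start - 1:left_flank_end]
    return right_flank_start, left_flank_end, tsd


# Toy example: the flanks overlap on reference positions 5..7.
toy_reference = "ACGTTAGGCATCGA"
print(infer_tsd(toy_reference, left_flank_end=7, right_flank_start=5))
```

In this toy case the two flanks share three bases, so the insertion site is resolved to an exact interval rather than to an approximate window.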

2016 ◽  
Author(s):  
Jinfeng Chen ◽  
Travis Wrightsman ◽  
Susan R Wessler ◽  
Jason E. Stajich

Background Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and in generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive deep sequencing technology has made it affordable to apply population genetic methods to whole genomes, using methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods We have developed the tool RelocaTE2 ( http://github.com/stajichlab/RelocaTE2 ) for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects the target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate of less than 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e2942 ◽  
Author(s):  
Jinfeng Chen ◽  
Travis R. Wrightsman ◽  
Susan R. Wessler ◽  
Jason E. Stajich

Background Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and in generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive deep sequencing technology has made it affordable to apply population genetic methods to whole genomes, using methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects the target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate of less than 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.


2021 ◽  
Author(s):  
Chamteut Oh ◽  
Palash Sashittal ◽  
Aijia Zhou ◽  
Leyi Wang ◽  
Mohammed El-Kebir ◽  
...  

Monitoring the prevalence of SARS-CoV-2 variants is necessary to make informed public health decisions during the COVID-19 pandemic. PCR assays have received global attention, facilitating rapid understanding of variant dynamics because they are more accessible and scalable than genome sequencing. However, as PCR assays target only a few mutations, their accuracy could be compromised when these mutations are not exclusive to target variants. Here we show how to design variant-specific PCR assays with high sensitivity and specificity across different geographical regions by incorporating sequences deposited in the GISAID database. Furthermore, we demonstrate that several previously developed PCR assays have decreased accuracy outside their study areas. We introduce PRIMES, an algorithm that enables the design of reliable PCR assays, as demonstrated in our experiments that enabled tracking of dominant SARS-CoV-2 variants in local sewage samples. Our findings will contribute to improving PCR assays for SARS-CoV-2 variant surveillance.
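To make concrete why non-exclusive mutations compromise an assay, the following Python sketch scores a mutation set against labeled genomes. It does not reproduce PRIMES, whose algorithm is not described in this abstract; the function `assay_accuracy`, the variant labels, and the mutation names are all hypothetical, and the sketch assumes an assay reports positive only when every targeted mutation is present in a genome.

```python
from typing import Iterable, Set, Tuple


def assay_accuracy(genomes: Iterable[Tuple[str, Set[str]]],
                   target_variant: str,
                   assay_mutations: Set[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a mutation set for one variant.

    Each genome is a (variant_label, set_of_mutations) pair. The assay is
    counted as positive when all assay mutations occur in the genome.
    """
    tp = fn = tn = fp = 0
    for variant, mutations in genomes:
        positive = assay_mutations <= mutations  # subset test
        if variant == target_variant:
            if positive:
                tp += 1
            else:
                fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical labels, not taken from GISAID).
toy_genomes = [
    ("A", {"m1", "m2"}),
    ("A", {"m1"}),          # target genome lacking m2: false negative
    ("B", {"m1", "m2"}),    # non-target genome carrying both: false positive
    ("B", {"m3"}),
    ("B", set()),
]
print(assay_accuracy(toy_genomes, target_variant="A", assay_mutations={"m1", "m2"}))
```

Running the same calculation separately on genomes from different regions would show how an assay that performs well in one area can lose accuracy elsewhere.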


Author(s):  
C. Lam ◽  
K. Gray ◽  
M. Gall ◽  
R. Sadsad ◽  
A. Arnott ◽  
...  

SARS-CoV-2 genomic surveillance has been vital in understanding the spread of COVID-19, the emergence of viral escape mutants and variants of concern. However, low viral loads in clinical specimens affect variant calling for phylogenetic analyses and detection of low frequency variants, important in uncovering infection transmission chains. We systematically evaluated three widely adopted SARS-CoV-2 whole genome sequencing methods for their sensitivity, specificity, and ability to reliably detect low frequency variants. Our analyses highlight that the ARTIC v3 protocol consistently displays high sensitivity for generating complete genomes at low viral loads compared with the probe-based Illumina respiratory viral oligo panel, and a pooled long-amplicon method. We show substantial variability in the number and location of low-frequency variants detected using the three methods, highlighting the importance of selecting appropriate methods to obtain high quality sequence data from low viral load samples for public health and genomic surveillance purposes.


2018 ◽  
Author(s):  
Zoltán Maróti ◽  
Zsolt Boldogkői ◽  
Dóra Tombácz ◽  
Michael Snyder ◽  
Tibor Kalmár

ABSTRACT Understanding the underlying genetic structure of human populations is of fundamental interest to both biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely used methods for collecting variant information at the DNA-level include whole genome sequencing, which continues to remain costly, and the more economical solution of array-based techniques, as these are capable of simultaneously genotyping a pre-selected set of variable DNA sites in the human genome. The largest publicly accessible set of human genomic sequence data available today originates from exome sequencing that comprises around 1.2% of the whole genome (approximately 30 million base pairs). In this study, we compared the application of the exome dataset to the array-based dataset and to the gold standard whole genome dataset using the same population genetic analysis methods. Our results draw attention to some of the inherent problems that arise from using pre-selected SNP sets for population genetic analysis. Additionally, we demonstrate that exome sequencing provides a better alternative to the array-based methods for population genetic analysis. In this study, we propose a strategy for unbiased variant collection from exome data and offer a bioinformatics protocol for proper data processing.


2021 ◽  
Author(s):  
Connie Lam ◽  
Karen-Ann Gray ◽  
Mailie Gall ◽  
Rosemarie Sadsad ◽  
Alicia Arnott ◽  
...  

SARS-CoV-2 genomic surveillance has been vital in understanding the spread of COVID-19, the emergence of viral escape mutants and variants of concern. However, low viral loads in clinical specimens affect variant calling for phylogenetic analyses and detection of low frequency variants, important in uncovering infection transmission chains. We systematically evaluated three widely adopted SARS-CoV-2 whole genome sequencing methods for their sensitivity, specificity, and ability to reliably detect low frequency variants. Our analyses highlight that the ARTIC v3 protocol consistently displays high sensitivity for generating complete genomes at low viral loads compared with the probe-based Illumina respiratory viral oligo panel, and a pooled long-amplicon method. We show substantial variability in the number and location of low-frequency variants detected using the three methods, highlighting the importance of selecting appropriate methods to obtain high quality sequence data from low viral load samples for public health and genomic surveillance purposes.


2019 ◽  
Vol 8 (4) ◽  
pp. 10105-10109

Accurate identification of pulmonary nodules with high sensitivity and specificity is essential for automated lung cancer diagnosis from CT scans. Although many deep learning-based algorithms have made great progress in improving the accuracy of nodule detection, the high false positive rate remains a difficult issue that limits automated diagnosis. We propose a novel customized Deep Convolutional Neural Network (DCNN) architecture for learning high-level image representations to achieve high classification accuracy with low variance in medical image binary classification tasks. Moreover, a High Sensitivity and Specificity system is introduced to eliminate erroneously detected nodule candidates by tracking the appearance changes across consecutive CT slices of each nodule. The proposed architecture is evaluated on the public Kaggle Data Science Bowl (KDSB17) challenge dataset. Our method detects lung nodules with high sensitivity and specificity and achieves 95% sensitivity.


2010 ◽  
Vol 48 (08) ◽  
Author(s):  
A Rosenthal ◽  
H Köppen ◽  
R Musikowski ◽  
R Schwanitz ◽  
J Behrendt ◽  
...  

2019 ◽  
Vol 26 (11) ◽  
pp. 1946-1959 ◽  
Author(s):  
Le Minh Tu Phan ◽  
Lemma Teshome Tufa ◽  
Hwa-Jung Kim ◽  
Jaebeom Lee ◽  
Tae Jung Park

Background: Tuberculosis (TB), one of the leading causes of death worldwide, is difficult to diagnose based only on signs and symptoms. Methods for TB detection are continuously being researched to design novel effective clinical tools for the diagnosis of TB. Objective: This article reviews the methods to diagnose TB at the latent and active stages and to recognize prospective TB diagnostic methods based on nanomaterials. Methods: The current methods for TB diagnosis were reviewed by evaluating their advantages and disadvantages. Furthermore, the trends in TB detection using nanomaterials were discussed regarding their performance capacity for clinical diagnostic applications. Results: Current methods such as microscopy, culture, and the tuberculin skin test are still being employed to diagnose TB; however, a highly sensitive point-of-care tool without false results is still needed. The utilization of nanomaterials to detect specific TB biomarkers with high sensitivity and specificity can provide a possible strategy to rapidly diagnose TB. Although it is challenging for nanodiagnostic platforms to be assessed in clinical trials, active TB diagnosis using nanomaterials is highly expected to achieve clinical significance for regular application. In addition, aspects and future directions in developing high-efficiency tools to diagnose active TB using advanced nanomaterials are expounded. Conclusion: This review suggests that nanomaterials have high potential as rapid, cost-effective tools to enhance the diagnostic sensitivity and specificity for the accurate diagnosis, treatment, and prevention of TB. Hence, portable nanobiosensors can be alternative effective tests to be exploited globally after clinical trial execution.

