Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ

iScience ◽  
2020 ◽  
Vol 23 (6) ◽  
pp. 101224 ◽  
Author(s):  
Ilia Minkin ◽  
Paul Medvedev
2018 ◽  
Vol 34 (17) ◽  
pp. i748-i756 ◽  
Author(s):  
Chirag Jain ◽  
Sergey Koren ◽  
Alexander Dilthey ◽  
Adam M Phillippy ◽  
Srinivas Aluru

2019 ◽  
Author(s):  
DJ Darwin R. Bandoy ◽  
B Carol Huang ◽  
Bart C. Weimer

Abstract: Taxonomic classification is an essential step in the analysis of microbiome data and depends on a reference database of whole-genome sequences. Taxonomic classifiers are built on established reference species, such as those in the Human Microbiome Project database, which is growing rapidly. While constructing a population-wide pangenome of the bacterium Hungatella, we discovered that the Human Microbiome Project reference species Hungatella hathewayi (WAL 18680) was significantly different from other members of this genus. Specifically, the reference lacked the core genome shared by the other members. Further analysis, using average nucleotide identity (ANI) and 16S rRNA comparisons, indicated that WAL 18680 was misclassified as Hungatella. This classification error is being amplified in taxonomic classifiers and will have a compounding effect as more microbiome analyses are done, resulting in inaccurate assignment of community members and leading to fallacious conclusions and possibly treatments. As automated genome homology assessment expands for microbiome analysis and outbreak detection, and as public health reliance on whole genomes increases, this issue will likely occur at an increasing rate. These observations highlight the need for reference-free methods for epidemiological investigation using whole-genome sequences, and the critical importance of accurate reference databases.


2008 ◽  
Vol 4 (5) ◽  
pp. 363 ◽  
Author(s):  
Peter F. Hallin ◽  
Tim T. Binnewies ◽  
David W. Ussery

2018 ◽  
Author(s):  
Chirag Jain ◽  
Sergey Koren ◽  
Alexander Dilthey ◽  
Adam M. Phillippy ◽  
Srinivas Aluru

Abstract. Motivation: Whole-genome alignment is an important problem in genomics for comparing different species, mapping draft assemblies to reference genomes, and identifying repeats. However, for large plant and animal genomes, this task remains compute and memory intensive. Results: We introduce an approximate algorithm for computing local alignment boundaries between long DNA sequences. Given a minimum alignment length and an identity threshold, our algorithm computes the desired alignment boundaries and identity estimates using k-mer-based statistics, and maintains sufficient probabilistic guarantees on the output sensitivity. Further, to prioritize higher-scoring alignment intervals, we develop a plane-sweep-based filtering technique which is theoretically optimal and practically efficient. Implementation of these ideas resulted in a fast and accurate assembly-to-genome and genome-to-genome mapper. As a result, we were able to map an error-corrected whole-genome NA12878 human assembly to the hg38 human reference genome in about one minute total execution time and < 4 GB memory using 8 CPU threads, achieving significant performance improvement over competing methods. Recall accuracy of computed alignment boundaries was consistently found to be > 97% on multiple datasets. Finally, we performed a sensitive self-alignment of the human genome to compute all duplications of length ≥ 1 Kbp and ≥ 90% identity. The reported output achieves good recall and covers 5% more bases than the current UCSC genome browser's segmental duplication annotation. Availability: https://github.com/marbl/[email protected], [email protected]
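
The abstract above mentions identity estimates derived from k-mer-based statistics without giving details. As a rough illustration only, and not the authors' implementation, the sketch below computes the exact Jaccard similarity of two sequences' k-mer sets and converts it to an identity estimate with the Mash distance formula, D = -(1/k) ln(2j / (1 + j)). The function names, the use of exact k-mer sets instead of sketches or minimizers, and the choice of this particular formula are all assumptions made for the example.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int) -> float:
    """Exact Jaccard similarity between the k-mer sets of a and b."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return len(ka & kb) / len(union)


def estimated_identity(j: float, k: int) -> float:
    """Convert a Jaccard value to an identity estimate.

    Uses the Mash distance D = -(1/k) * ln(2j / (1 + j)) and returns
    1 - D, clamped at 0. This is an assumed stand-in for the k-mer
    statistics mentioned in the abstract.
    """
    if j <= 0.0:
        return 0.0
    distance = -math.log(2.0 * j / (1.0 + j)) / k
    return max(0.0, 1.0 - distance)


if __name__ == "__main__":
    # Hypothetical toy sequences that differ only in the last base.
    a = "ACGTACGTGGCA"
    b = "ACGTACGTGGCT"
    j = jaccard(a, b, k=4)
    print(f"Jaccard = {j:.3f}, estimated identity = {estimated_identity(j, 4):.3f}")
```

On real genomes one would estimate the Jaccard value from a small sample of k-mers per window instead of full sets; the exact-set version here only shows how a k-mer overlap statistic maps to an identity figure that can be compared against a threshold such as the ≥ 90% used in the abstract.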


2016 ◽  
Vol 15 (27) ◽  
pp. 1464-1475 ◽  
Author(s):  
Dossa Komivi ◽  
Niang Mareme ◽  
E Assogbadjo Achille ◽  
Cisse Ndiaga ◽  
Diouf Diaga

2012 ◽  
Vol 154 (1) ◽  
pp. 19-25 ◽  
Author(s):  
V. Jandova ◽  
J. Klukowska-Rötzler ◽  
G. Dolf ◽  
J. Janda ◽  
P. Roosje ◽  
...  

2013 ◽  
Vol 70 (11) ◽  
pp. 621-631 ◽  
Author(s):  
Deborah Bartholdi ◽  
Peter Miny

New key technologies are currently bringing about a fundamental change in the clinical use of genetic laboratory diagnostics. In prenatal diagnostics, non-invasive testing for aneuploidies in maternal blood (NIPT) has become established, and this approach will in future also be applied to other chromosomal disorders and questions (monogenic diseases). In the postnatal setting, microarray analysis (array CGH, molecular karyotyping) has replaced conventional chromosome analysis in the evaluation of children with malformations, non-syndromic intellectual disability, or autism spectrum disorder. The new high-throughput sequencing methods allow efficient evaluation of genetically very heterogeneous conditions, such as epilepsies, neuromuscular diseases, and hearing loss, by means of diagnostic panels in which dozens of genes can be analyzed in parallel. Exome or whole-genome sequencing, used as a research method for identifying new disease genes, will also be applied in the diagnosis of severe unexplained diseases or developmental disorders that are genetically extremely heterogeneous. Sooner or later, the new methods will change clinical diagnostics in pediatrics and other areas of medicine, in that genetic laboratory diagnostics will tend to be used earlier in the evaluation process (genetics first).


2018 ◽  
Author(s):  
Mark Stevenson ◽  
Alistair T Pagnamenta ◽  
Heather G Mack ◽  
Judith A Savige ◽  
Kate E Lines ◽  
...  
