multiple genome Latest Research Papers

Multiple genome viewer (MGV): a new tool for visualization and comparison of multiple annotated genomes

Abstract: The assembled and annotated genomes for 16 inbred mouse strains (Lilue et al., Nat Genet 50:1574–1583, 2018) and two wild-derived strains (CAROLI/EiJ and PAHARI/EiJ) (Thybert et al., Genome Res 28:448–459, 2018) are valuable resources for mouse genetics and comparative genomics. We developed the multiple genome viewer (MGV; http://www.informatics.jax.org/mgv) to support visualization, exploration, and comparison of genome annotations within and across these genomes. MGV displays chromosomal regions of user-selected genomes as horizontal tracks. Equivalent features across the genome tracks are highlighted using vertical ‘swim lane’ connectors. Navigation across the genomes is synchronized as a researcher uses the scroll and zoom functions. Researchers can generate custom sets of genes and other genome features to be displayed in MGV by entering genome coordinates, function, phenotype, disease, and/or pathway terms. MGV was developed to be genome agnostic and can be used to display homologous features across genomes of different organisms.

A SacB-based system for diverse and multiple genome editing in Gluconobacter oxydans

Journal of Biotechnology ◽

10.1016/j.jbiotec.2021.07.004 ◽

2021 ◽

Author(s):

Zhijie Qin ◽

Shiqin Yu ◽

Li Liu ◽

Lingling Wang ◽

Jian Chen ◽

...

Keyword(s):

Genome Editing ◽

Gluconobacter Oxydans ◽

Multiple Genome

Pattern Detection in Multiple Genome Sequences with Applications: The Case of All SARS-CoV-2 Complete Variants

10.1101/2021.04.14.439840 ◽

2021 ◽

Author(s):

Konstantinos Xylogiannopoulos

Keyword(s):

Sequence Alignment ◽

Tandem Repeats ◽

Genomic Analysis ◽

Pattern Detection ◽

Accelerated Expansion ◽

Genome Sequences ◽

Multiple Genome ◽

Data Structures And Algorithms ◽

Meta Analyses ◽

Genome Comparisons

Pattern detection and string matching are fundamental problems in computer science, and the accelerated expansion of bioinformatics and computational biology has made them a core topic for both disciplines. The SARS-CoV-2 pandemic has made such problems more demanding, with hundreds or thousands of new genome variants discovered every week because of constant mutations, and with the need for fast and accurate analyses. Medicines and, mostly, vaccines must be altered to adapt and efficiently address mutations. The need for computational tools for genomic analysis, such as sequence alignment, is very important, although in most cases the resources and computational power needed are vast. The presented data structures and algorithms, specifically built for text mining and pattern detection, can help to address several bioinformatics problems efficiently. With a single execution of advanced algorithms, with limited space and time complexity, it is possible to acquire knowledge of all repeated patterns that exist in multiple genome sequences, and this information can be used for further meta-analyses. The potential of the presented solutions is demonstrated with the analysis of more than 55,000 SARS-CoV-2 genome sequences (collected on March 10, 2021) and the detection of all repeated patterns with length up to 60 nucleotides in these sequences, something practically impossible with other algorithms due to its complexity. These results can be used to help answer questions such as patterns common to all variants, sequence alignment, detection of palindromes and tandem repeats, genome comparisons, etc.
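
To make "repeated patterns in multiple genome sequences" concrete, here is a minimal Python sketch that indexes every fixed-length substring across a set of sequences and keeps those seen more than once. It is a naive illustration only, not the data structures or algorithms the paper presents (the abstract does not describe them); the function name `repeated_patterns`, the fixed length `k`, and the toy sequences are all assumptions made for this example.

```python
from collections import defaultdict


def repeated_patterns(sequences, k):
    """Return {pattern: [(sequence_index, start), ...]} for every
    length-k substring that occurs at least twice across all sequences.

    Hypothetical helper for illustration; a naive approach whose memory
    use grows with the total input length.
    """
    occurrences = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        # Slide a window of width k over the sequence.
        for start in range(len(seq) - k + 1):
            occurrences[seq[start:start + k]].append((seq_id, start))
    # Keep only patterns that appear more than once overall.
    return {p: occ for p, occ in occurrences.items() if len(occ) > 1}


# Toy input (made up for this sketch, not real SARS-CoV-2 data).
toy_sequences = ["ACGTACGT", "TTACGTAA"]
result = repeated_patterns(toy_sequences, 4)
for pattern in sorted(result):
    print(pattern, result[pattern])
```

Covering every length up to 60 with this approach would mean one pass per length, which is part of why an approach like this scales poorly to tens of thousands of genomes.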

Go Get Data (GGD) is a framework that facilitates reproducible access to genomic data

Nature Communications ◽

10.1038/s41467-021-22381-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Michael J. Cormier ◽

Jonathan R. Belyeu ◽

Brent S. Pedersen ◽

Joseph Brown ◽

Johannes Köster ◽

...

Keyword(s):

Data Integration ◽

Genomic Data ◽

Data Types ◽

Multiple Genome ◽

Link Type ◽

Wide Range

Abstract: The rapid increase in the amount of genomic data provides researchers with an opportunity to integrate diverse datasets and annotations when addressing a wide range of biological questions. However, genomic datasets are deposited on different platforms and are stored in numerous formats from multiple genome builds, which complicates the task of collecting, annotating, transforming, and integrating data as needed. Here, we developed Go Get Data (GGD) as a fast, reproducible approach to installing standardized data recipes. GGD is available on GitHub (https://gogetdata.github.io/), is extendable to other data types, and can streamline the complexities typically associated with data integration, saving researchers time and improving research reproducibility.

Scaling up multiple-genome alignments

Nature Methods ◽

10.1038/s41592-020-01045-8 ◽

2021 ◽

Vol 18 (1) ◽

pp. 33-33

Author(s):

Lin Tang

Keyword(s):

Scaling Up ◽

Multiple Genome

MBGC: Multiple Bacteria Genome Compressor

10.1101/2020.12.09.411678 ◽

2020 ◽

Author(s):

Szymon Grabowski ◽

Tomasz M. Kowalski

Keyword(s):

Compression Ratio ◽

Bacterial Genomes ◽

Multiple Genome ◽

Order Of Magnitude ◽

Mammalian Genomes

Abstract: Summary: Genomes within the same species reveal large similarity, exploited by specialized multiple genome compressors. The existing algorithms and tools are however targeted at large, e.g., mammalian, genomes, and their performance on bacteria strains is mediocre. In this work, we propose MBGC, a specialized genome compressor making use of specific redundancy of bacterial genomes. Our tool is not only compression efficient, but also fast. On a collection of 168,311 bacterial genomes, totalling 587 GB, we achieve the compression ratio around the factor of 730, and the compression (resp. decompression) speed around 1070 MB/s (resp. 740 MB/s) using 8 hardware threads, on a computer with a 6-core / 12-thread CPU and a fast SSD, being about 4 times more succinct and more than an order of magnitude faster in the compression than our main competitors. Availability and implementation: MBGC is freely available at github.com/kowallus/mbgc.
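
As a rough sense of scale, the figures quoted in the MBGC abstract can be turned into an approximate archive size and run time. The short Python calculation below is a back-of-the-envelope sketch, not a reported result: it assumes 1 GB = 1000 MB and that the quoted speeds are measured against the uncompressed data, neither of which the abstract states.

```python
# Figures quoted in the MBGC abstract.
collection_gb = 587           # total size of the bacterial genome collection
compression_ratio = 730       # approximate compression ratio
compress_mb_per_s = 1070      # approximate compression speed
decompress_mb_per_s = 740     # approximate decompression speed

# Assumptions (not stated in the abstract): decimal units, and speeds
# expressed relative to the uncompressed input size.
collection_mb = collection_gb * 1000

compressed_gb = collection_gb / compression_ratio
compress_seconds = collection_mb / compress_mb_per_s
decompress_seconds = collection_mb / decompress_mb_per_s

print(f"compressed size: about {compressed_gb:.2f} GB")
print(f"compression time: about {compress_seconds / 60:.1f} min")
print(f"decompression time: about {decompress_seconds / 60:.1f} min")
```

Under these assumptions the archive comes to roughly 0.8 GB, with about 9 minutes to compress and about 13 minutes to decompress.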

Progressive Cactus is a multiple-genome aligner for the thousand-genome era

Nature ◽

10.1038/s41586-020-2871-y ◽

2020 ◽

Vol 587 (7833) ◽

pp. 246-251 ◽

Cited By ~ 5

Author(s):

Joel Armstrong ◽

Glenn Hickey ◽

Mark Diekhans ◽

Ian T. Fiddes ◽

Adam M. Novak ◽

...

Keyword(s):

Large Scale ◽

De Novo ◽

Genome Alignment ◽

Vertebrate Genome ◽

Multiple Genome ◽

Sequencing Technologies ◽

Third Generation Sequencing ◽

Genome Assemblies ◽

Information Database ◽

Generation Sequencing

Abstract: New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies [1–3]. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database [4] increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies [5] are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus [6], a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.

Search and comparison of (epi)genomic feature patterns in multiple genome browser tracks

BMC Bioinformatics ◽

10.1186/s12859-020-03781-2 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Arnaud Ceol ◽

Piero Montanari ◽

Ilaria Bartolini ◽

Stefano Ceri ◽

Paolo Ciaccia ◽

...

Keyword(s):

Pattern Search ◽

Genome Browser ◽

Use Cases ◽

Genomic Feature ◽

Entire Genome ◽

Multiple Genome ◽

Link Type ◽

Genome Wide ◽

Biological Interest ◽

Genomic Regions

Abstract: Background: Genome browsers are widely used for locating interesting genomic regions, but their interactive use is obviously limited to inspecting short genomic portions. An ideal interaction is to provide patterns of regions on the browser, and then extract other genomic regions over the whole genome where such patterns occur, ranked by similarity. Results: We developed SimSearch, an optimized pattern-search method and an open source plugin for the Integrated Genome Browser (IGB), to find genomic region sets that are similar to a given region pattern. It provides efficient visual genome-wide analytics computation in large datasets; the plugin supports intuitive user interactions for selecting an interesting pattern on IGB tracks and visualizing the computed occurrences of similar patterns along the entire genome. SimSearch also includes functions for the annotation and enrichment of results, and is enhanced with a Quickload repository including numerous epigenomic feature datasets from ENCODE and Roadmap Epigenomics. The paper also includes some use cases to show multiple genome-wide analyses of biological interest, which can be easily performed by taking advantage of the presented approach. Conclusions: The novel SimSearch method provides innovative support for effective genome-wide pattern search and visualization; its relevance and practical usefulness is demonstrated through a number of significant use cases of biological interest. The SimSearch IGB plugin, documentation, and code are freely available at https://deib-geco.github.io/simsearch-app/ and https://github.com/DEIB-GECO/simsearch-app/.

Go Get Data (GGD): simple, reproducible access to scientific data

10.1101/2020.09.10.291377 ◽

2020 ◽

Author(s):

Michael J. Cormier ◽

Jonathan R. Belyeu ◽

Brent S. Pedersen ◽

Joseph Brown ◽

Johannes Köster ◽

...

Keyword(s):

Scientific Data ◽

Inherent Difficulty ◽

Multiple Genome ◽

Link Type ◽

Genomics Research

Abstract: Genomics research is complicated by the inherent difficulty of collecting, transforming, and integrating the numerous datasets and annotations germane to one’s research. Furthermore, these data exist in disparate sources, and are stored in numerous, often abused formats from multiple genome builds. Since these complexities waste time, inhibit reproducibility, and curtail research creativity, we developed Go Get Data (GGD; https://gogetdata.github.io/) as a fast, reproducible approach to installing standardized data recipes.

CRISPR/Cas with ribonucleoprotein complexes and transiently selected telomere vectors allows highly efficient marker-free and multiple genome editing in Botrytis cinerea

PLoS Pathogens ◽

10.1371/journal.ppat.1008326 ◽

2020 ◽

Vol 16 (8) ◽

pp. e1008326 ◽

Cited By ~ 2

Author(s):

Thomas Leisen ◽

Fabian Bietz ◽

Janina Werner ◽

Alex Wegner ◽

Ulrich Schaffrath ◽

...

Keyword(s):

Genome Editing ◽

Botrytis Cinerea ◽

Multiple Genome ◽

Highly Efficient ◽

Ribonucleoprotein Complexes ◽

Marker Free

multiple genome
Recently Published Documents

Multiple genome viewer (MGV): a new tool for visualization and comparison of multiple annotated genomes

A SacB-based system for diverse and multiple genome editing in Gluconobacter oxydans

Pattern Detection in Multiple Genome Sequences with Applications: The Case of All SARS-CoV-2 Complete Variants

Go Get Data (GGD) is a framework that facilitates reproducible access to genomic data

Scaling up multiple-genome alignments

MBGC: Multiple Bacteria Genome Compressor

Progressive Cactus is a multiple-genome aligner for the thousand-genome era

Search and comparison of (epi)genomic feature patterns in multiple genome browser tracks

Go Get Data (GGD): simple, reproducible access to scientific data

CRISPR/Cas with ribonucleoprotein complexes and transiently selected telomere vectors allows highly efficient marker-free and multiple genome editing in Botrytis cinerea
