genome projects Latest Research Papers

A Deep Dive into Genome Assemblies of Non-vertebrate Animals

Non-vertebrate species represent about 95% of known metazoan (animal) diversity. They remain relatively unexplored genetically, but understanding their genome structure and function is pivotal for expanding our knowledge of evolution, ecology and biodiversity. Following continuous improvements in sequencing technologies and their decreasing costs, many genome assembly tools have been released, and a significant number of genome projects have been completed in recent years. In this review, we examine the current state of genome projects for non-vertebrate animal species. We present an overview of available sequencing technologies, assembly approaches, pre- and post-processing steps, and genome assembly evaluation methods, as well as their application to non-vertebrate animal genomes.

Applications of CRISPR/Cas Technology to Research the Synthetic Genomics of Yeast

10.5772/intechopen.100561 ◽

2021 ◽

Author(s):

Huafeng Lin ◽

Haizhen Wang ◽

Aimin Deng ◽

Minjing Rong ◽

Lei Ye ◽

...

Keyword(s):

Saccharomyces Cerevisiae ◽

Synthetic Biology ◽

Gene Editing ◽

Functional Genes ◽

Whole Genome ◽

Yeast Saccharomyces Cerevisiae ◽

Synthetic Genome ◽

The World ◽

Synthetic Genomics ◽

Genome Projects

Whole-genome projects have opened a window onto the diversity and complexity of biological genomes by generating immense amounts of data. To unravel the genome, scientists around the world have dedicated themselves to annotating these massive datasets. However, finding exact and valuable information in them is like looking for a needle in a haystack. Advances in gene-editing technology now allow researchers to precisely manipulate targeted functional genes in the genome with state-of-the-art tools, facilitating studies in biology, agriculture, the food industry, medicine, the environment and healthcare. As pioneering editing tools with many versatile homologs and variants, CRISPR/Cas systems are rapidly giving impetus to the development of synthetic genomics and synthetic biology. In this chapter, we first present the classification and the structural and functional diversity of CRISPR/Cas systems. We then highlight, in chronological order, applications of CRISPR/Cas technology to the synthetic genome of yeast (Saccharomyces cerevisiae). Finally, we summarize synthetic genomics and synthetic biotechnology based on CRISPR/Cas systems, and discuss prospects for their further use in yeast.

rKOMICS: an R package for processing mitochondrial minicircle assemblies in population-scale genome projects

BMC Bioinformatics ◽

10.1186/s12859-021-04384-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Manon Geerts ◽

Achim Schnaufer ◽

Frederik Van den Broeck

Keyword(s):

R Package ◽

Population Diversity ◽

Sequence Diversity ◽

Mitochondrial Genomes ◽

Leishmania Braziliensis ◽

Sequence Alignments ◽

Parasitic Protozoa ◽

Population Scale ◽

Genome Projects ◽

Minicircle Sequence

Abstract Background The advent of population-scale genome projects has revolutionized our biological understanding of parasitic protozoa. However, while hundreds to thousands of nuclear genomes of parasitic protozoa have been generated and analyzed, information about the diversity, structure and evolution of their mitochondrial genomes remains fragmentary, mainly because of their extraordinary complexity. Indeed, unicellular flagellates of the order Kinetoplastida contain the structurally most complex mitochondrial genome of all eukaryotes, organized as a giant network of homogeneous maxicircles and heterogeneous minicircles. We recently developed KOMICS, an analysis toolkit that automates the assembly and circularization of the mitochondrial genomes of Kinetoplastid parasites. While this tool overcomes the limitation of extracting mitochondrial assemblies from Next-Generation Sequencing datasets, interpreting and visualizing the genetic (dis)similarity within and between samples remains a time-consuming process.
Results Here, we present a new analysis toolkit, rKOMICS, to streamline the analyses of minicircle sequence diversity in population-scale genome projects. rKOMICS is a user-friendly R package that has simple installation requirements and that is applicable to all 27 trypanosomatid genera. Once minicircle sequence alignments are generated, rKOMICS allows users to examine, summarize and visualize minicircle sequence diversity within and between samples through the analyses of minicircle sequence clusters. We showcase the functionalities of the (r)KOMICS tool suite using a whole-genome sequencing dataset from a recently published study on the history of diversification of the Leishmania braziliensis species complex in Peru. Analyses of population diversity and structure highlighted differences in minicircle sequence richness and composition between Leishmania subspecies, and between subpopulations within subspecies.
Conclusion The rKOMICS package establishes a critical framework to manipulate, explore and extract biologically relevant information from mitochondrial minicircle assemblies in tens to hundreds of samples simultaneously and efficiently. This should facilitate research that aims to develop new molecular markers for identifying species-specific minicircles, or to study the ancestry of parasites for complementary insights into their evolutionary history.

The Autophagy Machinery in Human-Parasitic Protists; Diverse Functions for Universally Conserved Proteins

Cells ◽

10.3390/cells10051258 ◽

2021 ◽

Vol 10 (5) ◽

pp. 1258

Author(s):

Hirokazu Sakamoto ◽

Kumiko Nakada-Tsukui ◽

Sébastien Besteiro

Keyword(s):

Membrane Vesicle ◽

Model Organisms ◽

Unicellular Eukaryotes ◽

Current State ◽

Important Challenge ◽

Molecular Machinery ◽

Cellular Machinery ◽

Apparent Reduction ◽

Related Proteins ◽

Genome Projects

Autophagy is a eukaryotic cellular machinery that is able to degrade large intracellular components, including organelles, and plays a pivotal role in cellular homeostasis. Target materials are enclosed by a double-membrane vesicle called the autophagosome, whose formation is coordinated by autophagy-related proteins (ATGs). Studies of yeast and Metazoa have identified approximately 40 ATGs. Genome projects for unicellular eukaryotes revealed that some ATGs are conserved in all eukaryotic supergroups, whereas others have arisen or been lost during evolution in specific lineages. In spite of an apparent reduction in the ATG molecular machinery found in parasitic protists, it has become clear that ATGs play an important role in stage differentiation or organelle maintenance, sometimes with an original function that is unrelated to canonical degradative autophagy. In this review, we aim to briefly summarize the current state of knowledge in parasitic protists, in the light of the latest important findings from more canonical model organisms. Determining the roles of ATGs and the diversity of their functions in various lineages is an important challenge for understanding the evolutionary background of autophagy.

Molluscan phylogenomics requires strategically selected genomes

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2020.0161 ◽

2021 ◽

Vol 376 (1825) ◽

Cited By ~ 2

Author(s):

Julia D. Sigwart ◽

David R. Lindberg ◽

Chong Chen ◽

Jin Sun

Keyword(s):

Small Sample ◽

Whole Genome ◽

Future Directions ◽

History Of ◽

Living Species ◽

Genome Assemblies ◽

Optimal Set ◽

Medical Interest ◽

Genome Projects ◽

Selection Of

The extraordinary diversity in molluscan body plans, and the genomic mechanisms that enable it, remains one of the great questions of evolution. The eight distinct living taxonomic classes of molluscs are each unambiguously monophyletic; however, significant controversy remains about the phylogenetic relationships among those eight branches. Molluscs are the second-largest animal phylum, with over 100 000 living species with broad biological, economic and medical interest. To date, only around 53 genome assemblies have been accessioned to NCBI GenBank covering only four of the eight living molluscan classes. Furthermore, the molluscan taxa where partial or whole-genome assemblies are available are often aberrantly fast evolving or recently derived lineages. Characteristic adaptations provide interesting targets for whole-genome projects, in animals like the scaly-foot snail or octopus, but without basal-branching lineages for comparison, the context of recently derived features cannot be assessed. The currently available genomes also create a non-optimal set of taxa for resolving deeper phylogenetic branches: they are a small sample representing a large group, and those that are available come primarily from a rarefied pool. Thoughtful selection of taxa for future projects should focus on the blank areas of the molluscan tree, which are ripe with opportunities to delve into peculiarities of genome evolution, and reveal the biology and evolutionary history of molluscs. This article is part of the Theo Murphy meeting issue ‘Molluscan genomics: broad insights and future directions for a neglected phylum’.

Torix Rickettsia are widespread in arthropods and reflect a neglected symbiosis

GigaScience ◽

10.1093/gigascience/giab021 ◽

2021 ◽

Vol 10 (3) ◽

Cited By ~ 1

Author(s):

Jack Pilgrim ◽

Panupong Thongprem ◽

Helen R Davison ◽

Stefanos Siozios ◽

Matthew Baylis ◽

...

Keyword(s):

Hot Spots ◽

Hot Spot ◽

Significant Proportion ◽

Intracellular Bacteria ◽

Future Directions ◽

Previous Hypothesis ◽

Causative Agents ◽

Phloem Feeding ◽

Genome Projects ◽

Human And Animal Diseases

Abstract Background Rickettsia are intracellular bacteria best known as the causative agents of human and animal diseases. Although these medically important Rickettsia are often transmitted via haematophagous arthropods, other Rickettsia, such as those in the Torix group, appear to reside exclusively in invertebrates and protists with no secondary vertebrate host. Importantly, little is known about the diversity or host range of Torix group Rickettsia. Results This study describes the serendipitous discovery of Rickettsia amplicons in the Barcode of Life Data System (BOLD), a sequence database specifically designed for the curation of mitochondrial DNA barcodes. Of 184,585 barcode sequences analysed, Rickettsia is observed in ∼0.41% of barcode submissions and is more likely to be found than Wolbachia (0.17%). The Torix group of Rickettsia are shown to account for 95% of all unintended amplifications from the genus. A further targeted PCR screen of 1,612 individuals from 169 terrestrial and aquatic invertebrate species identified mostly Torix strains and supports the “aquatic hot spot” hypothesis for Torix infection. Furthermore, the analysis of 1,341 SRA deposits indicates that Torix infections represent a significant proportion of all Rickettsia symbioses found in arthropod genome projects. Conclusions This study supports a previous hypothesis that suggests that Torix Rickettsia are overrepresented in aquatic insects. In addition, multiple methods reveal further putative hot spots of Torix Rickettsia infection, including in phloem-feeding bugs, parasitoid wasps, spiders, and vectors of disease. The unknown host effects and transmission strategies of these endosymbionts make these newly discovered associations important to inform future directions of investigation involving the understudied Torix Rickettsia.

IGD: high-performance search for large-scale genomic interval datasets

Bioinformatics ◽

10.1093/bioinformatics/btaa1062 ◽

2020 ◽

Author(s):

Jianglin Feng ◽

Nathan C Sheffield

Keyword(s):

High Performance ◽

Large Scale ◽

Interval Data ◽

Scale Analysis ◽

Genome Database ◽

Genomic Interval ◽

Critical Resource ◽

Genomic Regions ◽

Genome Projects

Abstract Summary Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions. Availability https://github.com/databio/IGD
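The abstract above names a linear binning method without giving details. As a rough illustration of the general idea only, the following Python sketch indexes intervals by fixed-width bins so that a query inspects just the bins it touches. It is a generic binning scheme, not IGD's implementation: the bin width `BIN_SIZE`, the function names `build_index` and `query`, and the half-open coordinate convention are all assumptions made for this example.

```python
from collections import defaultdict

BIN_SIZE = 16_384  # hypothetical bin width in base pairs (an assumption)


def build_index(intervals, bin_size=BIN_SIZE):
    """Map (chrom, bin number) -> list of intervals overlapping that bin.

    Intervals are (chrom, start, end) tuples with half-open coordinates
    and start < end.
    """
    index = defaultdict(list)
    for chrom, start, end in intervals:
        # Store the interval in every bin it touches.
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            index[(chrom, b)].append((chrom, start, end))
    return index


def query(index, chrom, start, end, bin_size=BIN_SIZE):
    """Return the sorted, de-duplicated intervals overlapping [start, end)."""
    hits = set()
    # Only the bins covered by the query need to be inspected.
    for b in range(start // bin_size, (end - 1) // bin_size + 1):
        for iv in index.get((chrom, b), ()):
            _, s, e = iv
            if s < end and start < e:  # half-open overlap test
                hits.add(iv)
    return sorted(hits)
```

In this sketch an interval that spans several bins is stored once per bin, which trades memory for lookup speed; the set in `query` removes the resulting duplicates. How IGD itself handles these trade-offs is not described in the abstract.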

Computational Personalized Medicine in Cancer Research in the -Omics Data Era

Research and Development on Information and Communication Technology ◽

10.32913/mic-ict-research.v2020.n1.899 ◽

2020 ◽

Vol 2020 (1) ◽

pp. 24-31

Author(s):

Lê Đức Hậu ◽

Quynh Diep Nguyen

Keyword(s):

Cancer Research ◽

Personalized Medicine ◽

Data Fusion ◽

Diagnosis And Treatment ◽

Molecular Characteristics ◽

Omics Data ◽

Patient Stratification ◽

Medicine Research ◽

Personalized Diagnosis ◽

Genome Projects

Omics data (e.g., genomics, transcriptomics, proteomics, epigenomics, etc.) generated from high-throughput next-generation sequencers in the big human genome and cancer genome projects have changed the way to study personalized medicine. In the future, personalized medicine will not be limited to diagnosis and treatment based on a few known disease-associated mutations on some genes, but will rely on whole molecular characteristics of patients by integrating their –omics data. In this study, we draw a big picture of personalized medicine research in cancer research of the –omics data era, including –omics databases, challenges of data fusion to solve two major problems in personalized medicine, i.e., personalized diagnosis and treatment. These problems are approached as patient stratification and drug response prediction based on the –omics data by computational methods.

Genomes OnLine Database (GOLD) v.8: overview and updates

Nucleic Acids Research ◽

10.1093/nar/gkaa983 ◽

2020 ◽

Vol 49 (D1) ◽

pp. D723-D733

Author(s):

Supratim Mukherjee ◽

Dimitri Stamatis ◽

Jon Bertsch ◽

Galina Ovchinnikova ◽

Jagadish Chandrabose Sundaramurthi ◽

...

Keyword(s):

Controlled Vocabulary ◽

Current Status ◽

Sequence Information ◽

Online Database ◽

Current Version ◽

Microbial Genomes ◽

Wide Range ◽

Four Levels ◽

Public Repositories ◽

Genome Projects

Abstract The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, including 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, is described in this manuscript.

Ultra-fast Prediction of Somatic Structural Variations by Reduced Read Mapping via Pan-Genome k-mer Sets

10.1101/2020.10.25.354456 ◽

2020 ◽

Author(s):

Min-Hak Choi ◽

Jang-il Sohn ◽

Dohun Yi ◽

A Vipin Menon ◽

Yeon Jeong Kim ◽

...

Keyword(s):

Structural Variation ◽

Copy Number Alterations ◽

Structural Variations ◽

Read Mapping ◽

High Performing ◽

Computational Costs ◽

Pan Genome ◽

Whole Genomes ◽

Fast Prediction ◽

Genome Projects

Abstract Genome rearrangements often result in copy number alterations of cancer-related genes and cause the formation of cancer-related fusion genes. Current structural variation (SV) callers, however, still produce massive numbers of false positives (FPs) and require high computational costs. Here, we introduce an ultra-fast and high-performing somatic SV detector, called ETCHING, that significantly reduces the mapping cost by filtering reads matched to pan-genome and normal k-mer sets. To reduce the number of FPs, ETCHING takes advantage of a Random Forest classifier that utilizes six breakend-related features. We systematically benchmarked ETCHING with other SV callers on reference SV materials, validated SV biomarkers, tumor and matched-normal whole genomes, and tumor-only targeted sequencing datasets. For all datasets, our SV caller was much faster (≥15X) than other tools without compromising performance or memory use. Our approach would provide not only the fastest method for large-scale genome projects but also an accurate clinically practical means for real-time precision medicine.
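The read-filtering step mentioned in this abstract can be pictured with a small Python sketch. It is only an illustration of k-mer set filtering in general, not ETCHING's code: the helper names (`kmers`, `build_kmer_set`, `filter_reads`), the choice of k, and the rule "keep a read if at least one of its k-mers is absent from the known set" are assumptions for this example. Reverse complements and sequencing errors are deliberately ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if len(seq) < k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_kmer_set(sequences, k):
    """Collect every k-mer seen in the given known sequences."""
    known = set()
    for seq in sequences:
        known |= kmers(seq, k)
    return known


def filter_reads(reads, known, k):
    """Keep only reads that carry at least one k-mer absent from `known`.

    Reads fully explained by known k-mers are dropped, so fewer reads
    need to be mapped afterwards. Reads shorter than k are also dropped.
    """
    return [read for read in reads if kmers(read, k) - known]


# Usage with toy data: only the read carrying novel k-mers survives.
known = build_kmer_set(["ACGTACGTAC"], k=4)
reads = ["ACGTAC", "ACGTTT", "TACGTA"]
kept = filter_reads(reads, known, k=4)
```

Under these assumptions, the cost saving comes from discarding reads before mapping; the subsequent classification step described in the abstract is outside the scope of this sketch.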
