genome projects
Recently Published Documents


TOTAL DOCUMENTS

170
(FIVE YEARS 22)

H-INDEX

25
(FIVE YEARS 2)

Author(s):  
Nadège Guiglielmoni ◽  
Ramón Rivera-Vicéns ◽  
Romain Koszul ◽  
Jean-François Flot

Non-vertebrate species represent about 95% of known metazoan (animal) diversity. They remain to this day relatively unexplored genetically, but understanding their genome structure and function is pivotal for expanding our current knowledge of evolution, ecology and biodiversity. Following the continuous improvements and decreasing costs of sequencing technologies, many genome assembly tools have been released, leading to a significant number of genome projects being completed in recent years. In this review, we examine the current state of genome projects of non-vertebrate animal species. We present an overview of available sequencing technologies, assembly approaches, pre- and post-processing steps, and genome assembly evaluation methods, as well as their application to non-vertebrate animal genomes.


2021 ◽  
Author(s):  
Huafeng Lin ◽  
Haizhen Wang ◽  
Aimin Deng ◽  
Minjing Rong ◽  
Lei Ye ◽  
...  

Whole-genome projects have opened a window onto the diversity and complexity of biological genomes by generating immense amounts of data. To unravel the riddle of the genome, scientists around the world have dedicated themselves to annotating these massive data. However, searching for exact and valuable information is like looking for a needle in a haystack. Advances in gene-editing technology have allowed researchers to precisely manipulate targeted functional genes in the genome with state-of-the-art gene-editing tools, facilitating studies in biology, agriculture, the food industry, medicine, the environment and healthcare. As pioneering editing tools, the CRISPR/Cas systems, with their many versatile homologs and variants, are now rapidly driving the development of synthetic genomics and synthetic biology. In this chapter, we first present the classification and the structural and functional diversity of CRISPR/Cas systems. We then highlight, in chronological order, applications of CRISPR/Cas technology to the synthetic genome of yeast (Saccharomyces cerevisiae). Finally, we summarize and discuss the prospects of synthetic genomics and synthetic biotechnology based on CRISPR/Cas systems and their further use in yeast.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Manon Geerts ◽  
Achim Schnaufer ◽  
Frederik Van den Broeck

Abstract Background The advent of population-scale genome projects has revolutionized our biological understanding of parasitic protozoa. However, while hundreds to thousands of nuclear genomes of parasitic protozoa have been generated and analyzed, information about the diversity, structure and evolution of their mitochondrial genomes remains fragmentary, mainly because of their extraordinary complexity. Indeed, unicellular flagellates of the order Kinetoplastida contain the most structurally complex mitochondrial genome of all eukaryotes, organized as a giant network of homogeneous maxicircles and heterogeneous minicircles. We recently developed KOMICS, an analysis toolkit that automates the assembly and circularization of the mitochondrial genomes of Kinetoplastid parasites. While this tool overcomes the limitation of extracting mitochondrial assemblies from Next-Generation Sequencing datasets, interpreting and visualizing the genetic (dis)similarity within and between samples remains a time-consuming process. Results Here, we present a new analysis toolkit, rKOMICS, to streamline the analyses of minicircle sequence diversity in population-scale genome projects. rKOMICS is a user-friendly R package that has simple installation requirements and that is applicable to all 27 trypanosomatid genera. Once minicircle sequence alignments are generated, rKOMICS allows users to examine, summarize and visualize minicircle sequence diversity within and between samples through the analyses of minicircle sequence clusters. We showcase the functionalities of the (r)KOMICS tool suite using a whole-genome sequencing dataset from a recently published study on the history of diversification of the Leishmania braziliensis species complex in Peru. Analyses of population diversity and structure highlighted differences in minicircle sequence richness and composition between Leishmania subspecies, and between subpopulations within subspecies.
Conclusion The rKOMICS package establishes a critical framework to manipulate, explore and extract biologically relevant information from mitochondrial minicircle assemblies in tens to hundreds of samples simultaneously and efficiently. This should facilitate research that aims to develop new molecular markers for identifying species-specific minicircles, or to study the ancestry of parasites for complementary insights into their evolutionary history.


Cells ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 1258
Author(s):  
Hirokazu Sakamoto ◽  
Kumiko Nakada-Tsukui ◽  
Sébastien Besteiro

Autophagy is a eukaryotic cellular machinery that is able to degrade large intracellular components, including organelles, and plays a pivotal role in cellular homeostasis. Target materials are enclosed by a double-membrane vesicle called the autophagosome, whose formation is coordinated by autophagy-related proteins (ATGs). Studies of yeast and Metazoa have identified approximately 40 ATGs. Genome projects for unicellular eukaryotes revealed that some ATGs are conserved in all eukaryotic supergroups but others have arisen or were lost during evolution in some specific lineages. In spite of an apparent reduction in the ATG molecular machinery found in parasitic protists, it has become clear that ATGs play an important role in stage differentiation or organelle maintenance, sometimes with an original function that is unrelated to canonical degradative autophagy. In this review, we aim to briefly summarize the current state of knowledge in parasitic protists, in the light of the latest important findings from more canonical model organisms. Determining the roles of ATGs and the diversity of their functions in various lineages is an important challenge for understanding the evolutionary background of autophagy.


Author(s):  
Julia D. Sigwart ◽  
David R. Lindberg ◽  
Chong Chen ◽  
Jin Sun

The extraordinary diversity in molluscan body plans, and the genomic mechanisms that enable it, remain one of the great questions of evolution. The eight distinct living taxonomic classes of molluscs are each unambiguously monophyletic; however, significant controversy remains about the phylogenetic relationships among those eight branches. Molluscs are the second-largest animal phylum, with over 100 000 living species of broad biological, economic and medical interest. To date, only around 53 genome assemblies have been accessioned to NCBI GenBank, covering only four of the eight living molluscan classes. Furthermore, the molluscan taxa for which partial or whole-genome assemblies are available are often aberrantly fast-evolving or recently derived lineages. Characteristic adaptations provide interesting targets for whole-genome projects, in animals like the scaly-foot snail or octopus, but without basal-branching lineages for comparison, the context of recently derived features cannot be assessed. The currently available genomes also create a non-optimal set of taxa for resolving deeper phylogenetic branches: they are a small sample representing a large group, and those that are available come primarily from a rarefied pool. Thoughtful selection of taxa for future projects should focus on the blank areas of the molluscan tree, which are ripe with opportunities to delve into peculiarities of genome evolution, and reveal the biology and evolutionary history of molluscs. This article is part of the Theo Murphy meeting issue ‘Molluscan genomics: broad insights and future directions for a neglected phylum’.


GigaScience ◽  
2021 ◽  
Vol 10 (3) ◽  
Author(s):  
Jack Pilgrim ◽  
Panupong Thongprem ◽  
Helen R Davison ◽  
Stefanos Siozios ◽  
Matthew Baylis ◽  
...  

Abstract Background Rickettsia are intracellular bacteria best known as the causative agents of human and animal diseases. Although these medically important Rickettsia are often transmitted via haematophagous arthropods, other Rickettsia, such as those in the Torix group, appear to reside exclusively in invertebrates and protists with no secondary vertebrate host. Importantly, little is known about the diversity or host range of Torix group Rickettsia. Results This study describes the serendipitous discovery of Rickettsia amplicons in the Barcode of Life Data System (BOLD), a sequence database specifically designed for the curation of mitochondrial DNA barcodes. Of 184,585 barcode sequences analysed, Rickettsia is observed in ∼0.41% of barcode submissions and is more likely to be found than Wolbachia (0.17%). The Torix group of Rickettsia are shown to account for 95% of all unintended amplifications from the genus. A further targeted PCR screen of 1,612 individuals from 169 terrestrial and aquatic invertebrate species identified mostly Torix strains and supports the “aquatic hot spot” hypothesis for Torix infection. Furthermore, the analysis of 1,341 SRA deposits indicates that Torix infections represent a significant proportion of all Rickettsia symbioses found in arthropod genome projects. Conclusions This study supports a previous hypothesis that suggests that Torix Rickettsia are overrepresented in aquatic insects. In addition, multiple methods reveal further putative hot spots of Torix Rickettsia infection, including in phloem-feeding bugs, parasitoid wasps, spiders, and vectors of disease. The unknown host effects and transmission strategies of these endosymbionts make these newly discovered associations important to inform future directions of investigation involving the understudied Torix Rickettsia.


Author(s):  
Jianglin Feng ◽  
Nathan C Sheffield

Abstract Summary Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions. Availability https://github.com/databio/IGD
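The general idea behind binning genomic intervals can be illustrated with a minimal sketch. This is not IGD's implementation or file format: the bin width, the function names `build_index` and `query`, and the in-memory dictionary layout are all assumptions made for illustration. Each interval is stored in every fixed-width bin it touches, so a query only inspects the bins it spans rather than the whole dataset.

```python
from collections import defaultdict

BIN_SIZE = 16_384  # hypothetical bin width in base pairs (an assumption, not IGD's value)


def build_index(intervals, bin_size=BIN_SIZE):
    """Store each half-open (chrom, start, end) interval in every fixed-width bin it touches."""
    index = defaultdict(list)
    for chrom, start, end in intervals:
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            index[(chrom, b)].append((start, end))
    return index


def query(index, chrom, start, end, bin_size=BIN_SIZE):
    """Return the distinct stored intervals that overlap the half-open query [start, end)."""
    hits = set()  # a set removes duplicates of intervals that span several bins
    for b in range(start // bin_size, (end - 1) // bin_size + 1):
        for s, e in index.get((chrom, b), ()):
            if s < end and start < e:  # standard half-open overlap test
                hits.add((s, e))
    return sorted(hits)


intervals = [("chr1", 100, 200), ("chr1", 16_000, 17_000), ("chr2", 100, 200)]
index = build_index(intervals)
overlaps = query(index, "chr1", 150, 16_500)
```

The sketch assumes every interval has `end > start`. Duplicating an interval across bins trades memory for lookup speed; a production tool would handle that trade-off far more carefully than this toy version does.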


Author(s):  
Lê Đức Hậu ◽  
Quynh Diep Nguyen

Omics data (e.g., genomics, transcriptomics, proteomics, epigenomics) generated by high-throughput next-generation sequencers in large human genome and cancer genome projects have changed the way personalized medicine is studied. In the future, personalized medicine will not be limited to diagnosis and treatment based on a few known disease-associated mutations in some genes, but will rely on the whole molecular characteristics of patients by integrating their omics data. In this study, we draw a big picture of personalized medicine research in cancer in the omics data era, including omics databases and the challenges of data fusion in solving two major problems in personalized medicine, i.e., personalized diagnosis and treatment. These problems are approached as patient stratification and drug response prediction based on omics data by computational methods.


2020 ◽  
Vol 49 (D1) ◽  
pp. D723-D733
Author(s):  
Supratim Mukherjee ◽  
Dimitri Stamatis ◽  
Jon Bertsch ◽  
Galina Ovchinnikova ◽  
Jagadish Chandrabose Sundaramurthi ◽  
...  

Abstract The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, which include 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, is described in this manuscript.


2020 ◽  
Author(s):  
Min-Hak Choi ◽  
Jang-il Sohn ◽  
Dohun Yi ◽  
A Vipin Menon ◽  
Yeon Jeong Kim ◽  
...  

Abstract Genome rearrangements often result in copy number alterations of cancer-related genes and cause the formation of cancer-related fusion genes. Current structural variation (SV) callers, however, still produce massive numbers of false positives (FPs) and require high computational costs. Here, we introduce an ultra-fast and high-performing somatic SV detector, called ETCHING, that significantly reduces the mapping cost by filtering reads matched to pan-genome and normal k-mer sets. To reduce the number of FPs, ETCHING takes advantage of a Random Forest classifier that utilizes six breakend-related features. We systematically benchmarked ETCHING with other SV callers on reference SV materials, validated SV biomarkers, tumor and matched-normal whole genomes, and tumor-only targeted sequencing datasets. For all datasets, our SV caller was much faster (≥15X) than other tools without compromising performance or memory use. Our approach would provide not only the fastest method for large-scale genome projects but also an accurate, clinically practical means for real-time precision medicine.
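The read-filtering idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not ETCHING's code: the helper names `kmers` and `filter_reads`, the value of k, and the rule "discard a read when all of its k-mers occur in the reference set" are one hypothetical reading of "filtering reads matched to pan-genome and normal k-mer sets".

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_reads(reads, reference_kmers, k):
    """Keep only reads that carry at least one k-mer absent from the reference set.

    Reads fully explained by reference k-mers are dropped, so later (costly)
    mapping only sees reads that might support a variant.
    """
    return [r for r in reads if any(km not in reference_kmers for km in kmers(r, k))]


K = 4  # hypothetical k; real tools use much larger values
reference_kmers = kmers("ACGTACGTAC", K)  # stands in for pan-genome plus normal k-mers
reads = ["ACGTAC", "ACGGAC"]
candidates = filter_reads(reads, reference_kmers, K)
```

In this sketch a read shorter than k yields no k-mers and is therefore dropped; whether that is the right behavior depends on the pipeline and is an assumption here.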

