genome annotations
Recently Published Documents


Total documents: 142 (five years: 74)

H-index: 18 (five years: 4)

Author(s):  
Yijun Liu ◽  
Hongyang Zhang ◽  
Xiaojun Tang ◽  
Xuejun Jiang ◽  
Xiaojuan Yan ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection can cause gastrointestinal symptoms in patients, but the role of gut microbiota in SARS-CoV-2 infection remains unclear. In this study, we aimed to investigate whether SARS-CoV-2 infection affects the composition and function of gut microbiota. We demonstrated for the first time that significant shifts in microbiome composition and function appeared in both asymptomatic and symptomatic SARS-CoV-2-infected cases. The relative abundance of Candidatus Saccharibacteria was significantly increased, whereas the level of Fibrobacteres was markedly reduced in SARS-CoV-2-infected cases. One bacterial taxon, Spirochaetes, differed between symptomatic patients and asymptomatic cases. At the genus level, Tyzzerella was the key taxon that increased markedly in both symptomatic and asymptomatic cases. Analyses of genome annotations further revealed that SARS-CoV-2 infection resulted in significant ‘functional dysbiosis’ of gut microbiota, affecting metabolic pathways, regulatory pathways, and biosynthesis of secondary metabolites, among others. We also identified potential metagenomic markers to discriminate SARS-CoV-2-infected symptomatic and asymptomatic cases from healthy controls. Together, these findings suggest that gut microbiota is of possible etiological and diagnostic importance for SARS-CoV-2 infection.


2021 ◽  
Author(s):  
Zhicheng Zhang ◽  
Jing Guo ◽  
Xu Cai ◽  
Yufang Li ◽  
Xi Xi ◽  
...  

The species Brassica rapa includes several important vegetable crops. The draft reference genome of B. rapa ssp. pekinensis was completed in 2011, and it has since been updated twice. The pangenome with structural variations of 18 B. rapa accessions was published in 2021. Although extensive genomic analysis has been conducted on B. rapa, a comprehensive genome annotation including gene structure, alternative splicing events, and non-coding genes is still lacking. Therefore, we used Pacific Biosciences (PacBio) single-molecule long-read technology to improve gene models and produced the annotated genome version 3.5. In total, we obtained 753,041 full-length non-chimeric (FLNC) reads and collapsed these into 92,810 non-redundant consensus isoforms, capturing 48% of the genes annotated in the B. rapa reference genome annotation v3.1. Based on the isoform data, we identified 830 novel protein-coding genes that were missed in previous genome annotations, defined the UTR regions of 20,340 annotated genes, and corrected 886 wrongly spliced genes. We also identified 28,564 alternative splicing (AS) events and 1,480 long non-coding RNAs (lncRNAs). We produced a relatively complete and high-quality reference transcriptome for B. rapa that can facilitate further functional genomic research.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Emily J. Shields ◽  
Masato Sorida ◽  
Lihong Sheng ◽  
Bogdan Sieriebriennikov ◽  
Long Ding ◽  
...  

Abstract. Background: Functional genomic analyses rely on high-quality genome assemblies and annotations. Highly contiguous genome assemblies have become available for a variety of species, but accurate and complete annotation of gene models, inclusive of alternative splice isoforms and transcription start and termination sites, remains difficult with traditional approaches. Results: Here, we utilized full-length isoform sequencing (Iso-Seq), a long-read RNA sequencing technology, to obtain a comprehensive annotation of the transcriptome of the ant Harpegnathos saltator. The improved genome annotations include additional splice isoforms and extended 3′ untranslated regions for more than 4000 genes. Reanalysis of RNA-seq experiments using these annotations revealed several genes with caste-specific differential expression and tissue- or caste-specific splicing patterns that were missed in previous analyses. The extended 3′ untranslated regions afforded great improvements in the analysis of existing single-cell RNA-seq data, resulting in the recovery of the transcriptomes of 18% more cells. The deeper single-cell transcriptomes obtained with these new annotations allowed us to identify additional markers for several cell types in the ant brain, as well as genes differentially expressed across castes in specific cell types. Conclusions: Our results demonstrate that Iso-Seq is an efficient and effective approach to improve genome annotations and maximize the amount of information that can be obtained from existing and future genomic datasets in Harpegnathos and other organisms.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Matthew R. Lueder ◽  
Regina Z. Cer ◽  
Miles Patrick ◽  
Logan J. Voegtly ◽  
Kyle A. Long ◽  
...  

Abstract. Background: Functional genome annotation is the process of labelling functional genomic regions with descriptive information. Manual curation can produce higher quality genome annotations than fully automated methods. Manual annotation efforts are time-consuming and complex; however, software can help reduce these drawbacks. Results: We created Manual Annotation Studio (MAS) to improve the efficiency of manual functional annotation of prokaryotic and viral genomes. MAS allows users to upload unannotated genomes, provides an interface to edit and upload annotations, tracks annotation history and progress, and saves data to a relational database. MAS provides users with pertinent information through a simple point-and-click interface to execute and visualize results for multiple homology search tools (blastp, rpsblast, and HHsearch) against multiple databases (Swiss-Prot, nr, CDD, PDB, and an internally generated database). MAS was designed to accept connections over the local area network (LAN) of a lab or organization so multiple users can access it simultaneously. MAS can take advantage of high-performance computing (HPC) clusters by interfacing with SGE or SLURM, and data can be exported from MAS in a variety of formats (FASTA, GenBank, GFF, and Excel). Conclusions: MAS streamlines and provides structure to manual functional annotation projects. MAS enhances the ability of users to generate, interpret, and compare results from multiple tools. The structure that MAS provides can improve project organization and reduce annotation errors. MAS is ideal for team-based annotation projects because it facilitates collaboration.


2021 ◽  
Author(s):  
Kriti Sengupta ◽  
Sai Suresh Hivarkar ◽  
Nikola Palevich ◽  
Prem Prashant Chaudhary ◽  
Prashant K. Dhakephalkar ◽  
...  

One cellulose-degrading strain, CB08, and two xylan-degrading strains, XB500-5 and X503, were isolated from buffalo rumen. All the strains were designated as putative novel species of Butyrivibrio based on phylogeny, phylogenomics, digital DNA-DNA hybridization, and average nucleotide identity with their closest type strains. The draft genome length of CB08 was ~3.54 Mb, while the X503 and XB500-5 genome sizes were ~3.24 Mb and ~3.27 Mb, respectively. Only 68.28% of total orthologous clusters were shared among the three genomes, and 40-44% of genes were identified as hypothetical proteins. The presence of genes encoding diverse carbohydrate-active enzymes (CAZymes) indicated the lignocellulolytic potential of these strains. Further, the genome annotations revealed the metabolic pathways for monosaccharide fermentation to acetate, butyrate, lactate, ethanol, and hydrogen. The presence of genes for chemotaxis, antibiotic resistance, antimicrobial activity, and synthesis of vitamins and essential fatty acids suggested the versatile metabolic nature of these Butyrivibrio strains in the rumen environment.


2021 ◽  
Author(s):  
Srujana S. Yadavalli ◽  
Jing Yuan

Small membrane proteins represent a subset of recently discovered small proteins (≤100 amino acids), which are a ubiquitous class of emerging regulators underlying bacterial adaptation to environmental stressors. Until relatively recently, small open reading frames encoding these proteins were not designated as genes in genome annotations. Therefore, our understanding of small protein biology was primarily limited to a few candidates associated with previously characterized larger partner proteins. Following the first systematic analyses of small proteins in E. coli over a decade ago, numerous small proteins have been uncovered across different bacteria. An estimated one-third of these newly discovered proteins are localized to the cell membrane, where they may interact with distinct groups of membrane proteins such as signal receptors, transporters, and enzymes, and affect their activities. Recently, there has been considerable progress in functionally characterizing small membrane protein regulators aided by innovative tools adapted specifically to study small proteins. Our review covers prototypical proteins that modulate a broad range of cellular processes such as transport, signal transduction, stress response, respiration, cell division, sporulation as well as membrane stability. Thus, small membrane proteins represent a versatile group of regulators of physiology not just at the membrane but the whole cell. Additionally, small membrane proteins have the potential for clinical applications, where some of the proteins may act as antibacterial agents themselves, while others serve as alternative drug targets for the development of novel antimicrobials.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hatim Almutairi ◽  
Michael D. Urbaniak ◽  
Michelle D. Bates ◽  
Narissara Jariyapan ◽  
Godwin Kwakye-Nuako ◽  
...  

Abstract. We provide the raw and processed data produced during the genome sequencing of isolates from six species of parasites from the sub-family Leishmaniinae: Leishmania martiniquensis (Thailand), Leishmania orientalis (Thailand), Leishmania enriettii (Brazil), Leishmania sp. Ghana, Leishmania sp. Namibia and Porcisia hertigi (Panama). De novo assembly was performed using Nanopore long reads to construct chromosome backbone scaffolds. We then corrected erroneous base calling by mapping short Illumina paired-end reads onto the initial assembly. Data has been deposited at NCBI as follows: raw sequencing output in the Sequence Read Archive, finished genomes in GenBank, and ancillary data in BioSample and BioProject. Derived data such as quality scoring, SAM files, genome annotations and repeat sequence lists have been deposited in Lancaster University’s electronic data archive with DOIs provided for each item. Our coding workflow has been deposited in GitHub and Zenodo repositories. This data constitutes a resource for the comparative genomics of parasites and for further applications in general and clinical parasitology.


Author(s):  
Almut Heinken ◽  
Stefanía Magnúsdóttir ◽  
Ronan M T Fleming ◽  
Ines Thiele

Abstract. Motivation: Manual curation of genome-scale reconstructions is laborious, yet existing automated curation tools do not typically take species-specific experimental and curated genomic data into account. Results: We developed DEMETER, a COBRA Toolbox extension that enables the efficient, simultaneous refinement of thousands of draft genome-scale reconstructions, while ensuring adherence to the quality standards in the field, agreement with available experimental data, and refinement of pathways based on manually refined genome annotations. Availability: DEMETER and tutorials are freely available at https://github.com/opencobra.


2021 ◽  
Author(s):  
Joel E. Richardson ◽  
Richard M. Baldarelli ◽  
Carol J. Bult

Abstract. The assembled and annotated genomes for 16 inbred mouse strains (Lilue et al., Nat Genet 50:1574–1583, 2018) and two wild-derived strains (CAROLI/EiJ and PAHARI/EiJ) (Thybert et al., Genome Res 28:448–459, 2018) are valuable resources for mouse genetics and comparative genomics. We developed the multiple genome viewer (MGV; http://www.informatics.jax.org/mgv) to support visualization, exploration, and comparison of genome annotations within and across these genomes. MGV displays chromosomal regions of user-selected genomes as horizontal tracks. Equivalent features across the genome tracks are highlighted using vertical ‘swim lane’ connectors. Navigation across the genomes is synchronized as a researcher uses the scroll and zoom functions. Researchers can generate custom sets of genes and other genome features to be displayed in MGV by entering genome coordinates, function, phenotype, disease, and/or pathway terms. MGV was developed to be genome agnostic and can be used to display homologous features across genomes of different organisms.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 758
Author(s):  
Michael J. Roach ◽  
Katelyn McNair ◽  
Sarah K Giles ◽  
Laura K Inglis ◽  
Evan Pargin ◽  
...  

Background: Most bacterial genomes contain integrated bacteriophages (prophages) in various states of decay. Many are active and able to excise from the genome and replicate, while others are cryptic prophages, remnants of their former selves. Over the last two decades, many computational tools have been developed to identify the prophage components of bacterial genomes, and it is a particularly active area for the application of machine learning approaches. However, progress is hindered and comparisons thwarted because there are no manually curated bacterial genomes that can be used to test new prophage prediction algorithms. Methods: We present a library of gold-standard bacterial genome annotations that include manually curated prophage annotations, and a computational framework to compare the predictions from different algorithms. We use this suite to compare all extant stand-alone prophage prediction algorithms to identify their strengths and weaknesses. We provide a FAIR dataset for prophage identification, and demonstrate the accuracy, precision, recall, and F1 score from the analysis of seven different algorithms for the prediction of prophages. Results: We identified different strengths and weaknesses between the prophage prediction tools. Several tools exhibit exceptional F1 scores, while others have better recall at the expense of more false positives. The tools vary greatly in runtime performance, with few exhibiting all desirable qualities for large-scale analyses. Conclusions: Our library of gold-standard prophage annotations and benchmarking framework provide a valuable resource for exploring strengths and weaknesses of current and future prophage annotation tools. We discuss caveats and concerns in this analysis, how those concerns may be mitigated, and avenues for future improvements. This framework will help developers identify opportunities for improvement and test updates. It will also help users in determining the tools that are best suited for their analysis.
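
The four metrics named in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each gene in a genome carries a curated true/false prophage label and that a tool emits one true/false call per gene; the abstract does not state the unit of comparison, and the names `Metrics` and `score_predictions` are hypothetical rather than part of the authors' framework.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def score_predictions(truth: list[bool], predicted: list[bool]) -> Metrics:
    """Score per-gene prophage calls against a curated truth set.

    Assumption: one boolean per gene, True meaning "inside a prophage".
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)              # prophage gene, called prophage
    fp = sum((not t) and p for t, p in pairs)        # host gene, called prophage
    fn = sum(t and (not p) for t, p in pairs)        # prophage gene, missed
    tn = sum((not t) and (not p) for t, p in pairs)  # host gene, correctly ignored

    total = tp + fp + fn + tn
    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for eight genes; not data from the paper.
    truth = [True, True, True, False, False, False, False, False]
    predicted = [True, True, False, True, False, False, False, False]
    print(score_predictions(truth, predicted))
```

Because host genes usually far outnumber prophage genes, accuracy alone can look high even for a tool that misses most prophages, which is one reason precision, recall, and the F1 score are reported alongside it.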

