MicrobeAnnotator: a user-friendly, comprehensive functional annotation pipeline for microbial genomes

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Carlos A. Ruiz-Perez ◽  
Roth E. Conrad ◽  
Konstantinos T. Konstantinidis

Abstract Background High-throughput sequencing has increased the number of available microbial genomes recovered from isolates, single cells, and metagenomes. Accordingly, fast and comprehensive functional gene annotation pipelines are needed to analyze and compare these genomes. Although several approaches exist for genome annotation, these are typically not designed for easy incorporation into analysis pipelines, do not combine results from different annotation databases or offer easy-to-use summaries of metabolic reconstructions, and typically require large amounts of computing power for high-throughput analysis not available to the average user. Results Here, we introduce MicrobeAnnotator, a fully automated, easy-to-use pipeline for the comprehensive functional annotation of microbial genomes that combines results from several reference protein databases and returns the matching annotations together with key metadata such as the interlinked identifiers of matching reference proteins from multiple databases [KEGG Orthology (KO), Enzyme Commission (E.C.), Gene Ontology (GO), Pfam, and InterPro]. Further, the functional annotations are summarized into Kyoto Encyclopedia of Genes and Genomes (KEGG) modules as part of a graphical output (heatmap) that allows the user to quickly detect differences among (multiple) query genomes and cluster the genomes based on their metabolic similarity. MicrobeAnnotator is implemented in Python 3 and is freely available under an open-source Artistic License 2.0 from https://github.com/cruizperez/MicrobeAnnotator. Conclusions We demonstrated the capabilities of MicrobeAnnotator by annotating 100 Escherichia coli and 78 environmental Candidate Phyla Radiation (CPR) bacterial genomes and comparing the results to those of other popular tools. 
We showed that the use of multiple annotation databases allows MicrobeAnnotator to recover more annotations per genome than faster tools that use reduced databases, and that it is computationally efficient enough for use on personal computers. The output of MicrobeAnnotator can be easily incorporated into other analysis pipelines, while the results of other annotation tools can be seamlessly incorporated into MicrobeAnnotator to generate summary plots.

2020 ◽  
Author(s):  
Carlos A. Ruiz-Perez ◽  
Roth E. Conrad ◽  
Konstantinos T. Konstantinidis

Abstract Summary: High-throughput sequencing has increased the number of microbial genomes available from isolates, single cells, and metagenomes. To analyze and compare these genomes, fast and comprehensive annotation pipelines are needed. Although several approaches exist for genome annotation, these are typically not designed for easy incorporation into analysis pipelines, do not combine results from several annotation databases or offer easy-to-use summaries of metabolic reconstructions in a high-throughput mode. Here, we introduce MicrobeAnnotator, a fully automated pipeline for the comprehensive annotation of microbial genomes that combines results from several reference protein databases to reliably summarize the metabolic potential of the genomes based on KEGG modules. Availability: MicrobeAnnotator is implemented in Python and is freely available under an open source Artistic Licence 2.0 from https://github.com/cruizperez/


Molecules ◽  
2018 ◽  
Vol 23 (8) ◽  
pp. 1869 ◽  
Author(s):  
Stefano Dugheri ◽  
Alessandro Bonari ◽  
Matteo Gentili ◽  
Giovanni Cappelli ◽  
Ilenia Pompilio ◽  
...  

High-throughput screening of samples is the strategy of choice to detect occupational exposure biomarkers, yet it requires a user-friendly apparatus that gives relatively prompt results while ensuring high degrees of selectivity, precision, accuracy and automation, particularly in the preparation process. Miniaturization has attracted much attention in analytical chemistry and has driven solvent and sample savings as well as easier automation, the latter thanks to the introduction on the market of the three-axis autosampler. In light of the above, this contribution describes a novel user-friendly solid-phase microextraction (SPME) off- and on-line platform coupled with gas chromatography and triple quadrupole-mass spectrometry to determine urinary metabolites of polycyclic aromatic hydrocarbons: 1- and 2-hydroxy-naphthalene, 9-hydroxy-phenanthrene, 1-hydroxy-pyrene, 3- and 9-hydroxy-benzoanthracene, and 3-hydroxy-benzo[a]pyrene. In this new procedure, chromatography’s sensitivity is combined with the user-friendliness of N-tert-butyldimethylsilyl-N-methyltrifluoroacetamide on-fiber SPME derivatization using direct immersion sampling; moreover, specific isotope-labelled internal standards provide quantitative accuracy. The detection limits for the seven OH-PAHs ranged from 0.25 to 4.52 ng/L. Intra-session (from 2.5 to 3.0%) and inter-session (from 2.4 to 3.9%) repeatability was also evaluated. This method serves to identify suitable risk-control strategies for occupational hygiene conservation programs.


Author(s):  
Yachao Qu ◽  
Yong Huang ◽  
Di Liu ◽  
Yinuo Huang ◽  
Zhiyi Zhang ◽  
...  

T lymphocytes are the most important immune cells that affect both the development and treatment of hepatitis B. We used high-throughput sequencing to determine the diversity in the V and J regions of the TCRβ chain in 4 chronic hepatitis B patients before and after HBeAg seroconversion. Here, we demonstrate that the 4 patients expressed Vβ12-4 at the highest frequencies of 10.6%, 9.2%, 17.5%, and 7.5%, and Vβ28 was the second most common, with frequencies of 7.8%, 6.7%, 5.3%, and 10.9%, respectively. No significant changes were observed following seroconversion. With regard to the Jβ gene, Jβ2-1 was the most commonly expressed in the 4 patients at frequencies of 5.8%, 6.5%, 11.3%, and 7.3%, respectively. Analysis of the V-J region genes revealed several differences, including significant increases in the expression levels of V7-2-01-J2-1, V12-4-J1-1, and V28-1-J1-5 and a decrease in that of V19-01-J2-3. These results illustrate the presence of biased TCR Vβ and Jβ gene expression in the chronic hepatitis B patients. TRBVβ12-4, Vβ28, Jβ2-1, V7-2-01-J2-1, V12-4-J1-1, and V28-1-J1-5 may be associated with the development and treatment of CHB.


2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13890 ◽  
Author(s):  
Changjin Hong ◽  
Solaiappan Manimaran ◽  
William Evan Johnson

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/ .


2021 ◽  
Author(s):  
Rosie Drinkwater ◽  
Elizabeth L. Clare ◽  
Arthur Y. C. Chung ◽  
Stephen J. Rossiter ◽  
Eleanor M. Slade

Abstract The application of environmental DNA (eDNA) sampling in biodiversity surveys has gained widespread acceptance, especially in aquatic systems where free eDNA can be readily collected by filtering water. In terrestrial systems, eDNA-based approaches for assaying vertebrate biodiversity have tended to rely on blood-feeding invertebrates, including leeches and mosquitoes (termed invertebrate-derived DNA or iDNA). However, a key limitation of using blood-feeding taxa as samplers is that they are difficult to trap, and, in the case of leeches, are highly restricted to humid forest ecosystems. Dung beetles (superfamily Scarabaeoidea) feed on the faecal matter of terrestrial vertebrates and offer several potential benefits over blood-feeding invertebrates as samplers of vertebrate DNA. Importantly, these beetles can be easily captured in large numbers using simple, inexpensive baited traps; are globally distributed; and also occur in a wide range of biomes, allowing mammal diversity to be compared across habitats. In this exploratory study, we test the potential utility of dung beetles as vertebrate samplers by sequencing the mammal DNA contained within their guts. First, using a controlled feeding experiment, we show that mammalian DNA can be retrieved from the guts of large dung beetles (Catharsius renaudpauliani) for up to 10 hours after feeding. Second, by combining high-throughput sequencing of a multi-species assemblage of dung beetles with PCR replicates, we show that multiple mammal taxa can be identified with high confidence. By providing preliminary evidence that dung beetles can be used as a source of mammal DNA, our study highlights the potential for this widespread group to be used in future biodiversity monitoring surveys.


2021 ◽  
Author(s):  
Yu Hamaguchi ◽  
Chao Zeng ◽  
Michiaki Hamada

Abstract Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results: Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis. Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


Viruses ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1296
Author(s):  
Jonathan Burnie ◽  
Vera A. Tang ◽  
Joshua A. Welsh ◽  
Arvin T. Persaud ◽  
Laxshaginee Thaya ◽  
...  

The HIV-1 glycoprotein spike (gp120) is typically the first viral antigen that cells encounter before initiating immune responses, and is often the sole target in vaccine designs. Thus, characterizing the presence of cellular antigens on the surfaces of HIV particles may help identify new antiviral targets or impact targeting of gp120. Despite the importance of characterizing proteins on the virion surface, current techniques available for this purpose do not support high-throughput analysis of viruses, and typically only offer a semi-quantitative assessment of virus-associated proteins. Traditional bulk techniques often assess averages of viral preparations, which may mask subtle but important differences in viral subsets. On the other hand, microscopy techniques, which provide detail on individual virions, are difficult to use in a high-throughput manner and have low levels of sensitivity for antigen detection. Flow cytometry is a technique that traditionally has been used for rapid, high-sensitivity characterization of single cells, with limited use in detecting viruses, since the small size of viral particles hinders their detection. Herein, we report the detection and surface antigen characterization of HIV-1 pseudovirus particles by light scattering and fluorescence with flow cytometry, termed flow virometry for its specific application to viruses. We quantified three cellular proteins (integrin α4β7, CD14, and CD162/PSGL-1) in the viral envelope by directly staining virion-containing cell supernatants without the requirement of additional processing steps to distinguish virus particles or specific virus purification techniques. We also show that two antigens can be simultaneously detected on the surface of individual HIV virions, probing for the tetraspanin marker, CD81, in addition to α4β7, CD14, and CD162/PSGL-1. 
This study demonstrates new advances in calibrated flow virometry as a tool to provide sensitive, high-throughput characterization of the viral envelope in a more efficient, quantitative manner than previously reported techniques.


RNA Biology ◽  
2013 ◽  
Vol 10 (7) ◽  
pp. 1087-1092 ◽  
Author(s):  
Jinyu Wu ◽  
Qi Liu ◽  
Xin Wang ◽  
Jiayong Zheng ◽  
Tao Wang ◽  
...  
