Build a Bioinformatics Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics

10.21203/rs.2.21224/v1 ◽

2020 ◽

Author(s):

Hualin Liu ◽

Bingyue Xin ◽

Jinshui Zheng ◽

Hao Zhong ◽

Yun Yu ◽

...

Keyword(s):

Comparative Genomics ◽

Bioinformatics Analysis ◽

Gene Prediction ◽

Orthologous Gene ◽

Routine Analysis ◽

Simple Analysis ◽

Analysis Pipeline ◽

Operation Process ◽

Comparative Genomics Analysis ◽

Analysis Platform

Abstract: Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even simple analyses usually require calling several tools or writing scripts, which is complicated for most biological researchers. To simplify this process, especially for microbiologists, we developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for users who are not bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (http://bcam.hzau.edu.cn/linuxwgs.php).

Build a Bioinformatics Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics

10.21203/rs.2.21224/v2 ◽

2020 ◽

Cited By ~ 2

Author(s):

Hualin Liu ◽

Bingyue Xin ◽

Jinshui Zheng ◽

Hao Zhong ◽

Yun Yu ◽

...

Keyword(s):

Comparative Genomics ◽

Bioinformatics Analysis ◽

Gene Prediction ◽

Orthologous Gene ◽

Routine Analysis ◽

Simple Analysis ◽

Analysis Pipeline ◽

Operation Process ◽

Comparative Genomics Analysis ◽

Analysis Platform

Abstract: Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even simple analyses usually require calling several tools or writing scripts, which is complicated for most biological researchers. To simplify this process, especially for microbiologists, we developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for users who are not bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics or http://bcam.hzau.edu.cn/linuxwgs.php).

Build a Bioinformatics Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics

10.21203/rs.2.21224/v3 ◽

2020 ◽

Author(s):

Hualin Liu ◽

Bingyue Xin ◽

Jinshui Zheng ◽

Hao Zhong ◽

Yun Yu ◽

...

Keyword(s):

Comparative Genomics ◽

Bioinformatics Analysis ◽

Gene Prediction ◽

Orthologous Gene ◽

Routine Analysis ◽

Simple Analysis ◽

Analysis Pipeline ◽

Operation Process ◽

Comparative Genomics Analysis ◽

Analysis Platform

Abstract: Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even simple analyses usually require calling several tools or writing scripts, which is complicated for most biological researchers. To simplify this process, especially for microbiologists, we developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for users who are not bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics).

Build a Bioinformatics Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics

10.21203/rs.2.21224/v4 ◽

2020 ◽

Author(s):

Hualin Liu ◽

Bingyue Xin ◽

Jinshui Zheng ◽

Hao Zhong ◽

Yun Yu ◽

...

Keyword(s):

Comparative Genomics ◽

Bioinformatics Analysis ◽

Gene Prediction ◽

Orthologous Gene ◽

Routine Analysis ◽

Simple Analysis ◽

Analysis Pipeline ◽

Operation Process ◽

Comparative Genomics Analysis ◽

Analysis Platform

Abstract: Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even simple analyses usually require calling several tools or writing scripts, which is complicated for most biological researchers. To simplify this process, especially for microbiologists, we developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for users who are not bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (http://bcam.hzau.edu.cn/linuxwgs.php).

Categorization of Orthologous Gene Clusters in 92 Ascomycota Genomes Reveals Functions Important for Phytopathogenicity

Journal of Fungi ◽

10.3390/jof7050337 ◽

2021 ◽

Vol 7 (5) ◽

pp. 337

Author(s):

Daniel Peterson ◽

Tang Li ◽

Ana M. Calvo ◽

Yanbin Yin

Keyword(s):

Comparative Genomics ◽

Orthologous Gene ◽

Gene Clusters ◽

Phytopathogenic Fungi ◽

Secreted Proteins ◽

Economic Losses ◽

Comparative Genomic ◽

Signal Peptides ◽

Comparative Genomics Analysis

Phytopathogenic Ascomycota destroy valuable crops and are responsible for substantial economic losses each year. The present study aims to provide new insights into phytopathogenicity in Ascomycota from a comparative genomic perspective. To this end, orthologous gene groups (orthogroups) from 68 phytopathogenic and 24 non-phytopathogenic Ascomycota genomes were categorized into three classes: core, group-specific (pathogen or non-pathogen), and genome-specific accessory orthogroups. We found that (i) ~20% of orthogroups are group-specific and accessory in the 92 Ascomycota genomes, (ii) phytopathogenicity is not phylogenetically determined, (iii) group-specific orthogroups have more enriched functional terms than accessory orthogroups, a trend that is particularly evident in phytopathogenic fungi, (iv) secreted proteins with signal peptides and horizontal gene transfers (HGTs) are the two functional terms that show the highest occurrence and significance in group-specific orthogroups, and (v) a number of other functional terms also show higher significance and occurrence in group-specific orthogroups. Overall, our comparative genomics analysis identified positive enrichment between orthogroup classes and yielded a prediction of which genomic characteristics make an Ascomycete phytopathogenic. We conclude that genes shared by multiple phytopathogenic genomes are more important for phytopathogenicity than those that are unique to each genome.

Searching for a “Hidden” Prophage in a Marine Bacterium

Applied and Environmental Microbiology ◽

10.1128/aem.01450-09 ◽

2009 ◽

Vol 76 (2) ◽

pp. 589-595 ◽

Cited By ~ 20

Author(s):

Yanlin Zhao ◽

Kui Wang ◽

Hans-Wolfgang Ackermann ◽

Rolf U. Halden ◽

Nianzhi Jiao ◽

...

Keyword(s):

Sequence Similarity ◽

Genomic Analysis ◽

Bioinformatic Analysis ◽

Open Reading Frames ◽

Careful Examination ◽

Comparative Genomic ◽

Bacterial Genomes ◽

Genomic Analyses ◽

Experimental Laboratory ◽

Bacterial Genes

ABSTRACT Prophages are common in many bacterial genomes. Distinguishing putatively viable prophages from nonviable sequences can be a challenge, since some prophages are remnants of once-functional prophages that have been rendered inactive by mutational changes. In some cases, a putative prophage may be missed due to the lack of recognizable prophage loci. The genome of a marine roseobacter, Roseovarius nubinhibens ISM (hereinafter referred to as ISM), was recently sequenced and was reported to contain no intact prophage based on customary bioinformatic analysis. However, prophage induction experiments performed with this organism led to a different conclusion. In the laboratory, virus-like particles in the ISM culture increased more than 3 orders of magnitude following induction with mitomycin C. After careful examination of the ISM genome sequence, a putative prophage (ISM-pro1) was identified. Although this prophage contains only minimal phage-like genes, we demonstrated that this “hidden” prophage is inducible. Genomic analysis and reannotation showed that most of the ISM-pro1 open reading frames (ORFs) display the highest sequence similarity with Rhodobacterales bacterial genes and some ORFs are only distantly related to genes of other known phages or prophages. Comparative genomic analyses indicated that ISM-pro1-like prophages or prophage remnants are also present in other Rhodobacterales genomes. In addition, the lysis of ISM by this previously unrecognized prophage appeared to increase the production of gene transfer agents (GTAs). Our study suggests that a combination of in silico genomic analyses and experimental laboratory work is needed to fully understand the lysogenic features of a given bacterium.

Web-Based Genome Analysis of Bacterial Meningitis Pathogens for Public Health Applications Using the Bacterial Meningitis Genomic Analysis Platform (BMGAP)

Frontiers in Genetics ◽

10.3389/fgene.2020.601870 ◽

2020 ◽

Vol 11 ◽

Author(s):

Sean A. Buono ◽

Reagan J. Kelly ◽

Nadav Topaz ◽

Adam C. Retchless ◽

Hideky Silva ◽

...

Keyword(s):

Public Health ◽

Bacterial Meningitis ◽

Sensitivity And Specificity ◽

Genome Analysis ◽

Bacterial Species ◽

Genomic Analysis ◽

Comparative Genomic ◽

Whole Genome ◽

Public Health Response ◽

Analysis Platform

Effective laboratory-based surveillance and public health response to bacterial meningitis depends on timely characterization of bacterial meningitis pathogens. Traditionally, characterizing bacterial meningitis pathogens such as Neisseria meningitidis (Nm) and Haemophilus influenzae (Hi) required several biochemical and molecular tests. Whole genome sequencing (WGS) has enabled the development of pipelines capable of characterizing the given pathogen with equivalent results to many of the traditional tests. Here, we present the Bacterial Meningitis Genomic Analysis Platform (BMGAP): a secure, web-accessible informatics platform that facilitates automated analysis of WGS data in public health laboratories. BMGAP is a pipeline comprised of several components, including both widely used, open-source third-party software and customized analysis modules for the specific target pathogens. BMGAP performs de novo draft genome assembly and identifies the bacterial species by whole-genome comparisons against a curated reference collection of 17 focal species including Nm, Hi, and other closely related species. Genomes identified as Nm or Hi undergo multi-locus sequence typing (MLST) and capsule characterization. Further typing information is captured from Nm genomes, such as peptides for the vaccine antigens FHbp, NadA, and NhbA. Assembled genomes are retained in the BMGAP database, serving as a repository for genomic comparisons. BMGAP’s species identification and capsule characterization modules were validated using PCR and slide agglutination from 446 bacterial invasive isolates (273 Nm from nine different serogroups, 150 Hi from seven different serotypes, and 23 from nine other species) collected from 2017 to 2019 through surveillance programs. Among the validation isolates, BMGAP correctly identified the species for all 440 isolates (100% sensitivity and specificity) and accurately characterized all Nm serogroups (99% sensitivity and 98% specificity) and Hi serotypes (100% sensitivity and specificity). BMGAP provides an automated, multi-species analysis pipeline that can be extended to include additional analysis modules as needed. This provides easy-to-interpret and validated Nm and Hi genome analysis capacity to public health laboratories and collaborators. As the BMGAP database accumulates more genomic data, it grows as a valuable resource for rapid comparative genomic analyses during outbreak investigations.

FusoBase: an online Fusobacterium comparative genomic analysis platform

Database ◽

10.1093/database/bau082 ◽

2014 ◽

Vol 2014 (0) ◽

pp. bau082-bau082 ◽

Cited By ~ 3

Author(s):

M. Y. Ang ◽

H. Heydari ◽

N. S. Jakubovics ◽

M. I. Mahmud ◽

A. Dutta ◽

...

Keyword(s):

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Comparative Genomic ◽

Analysis Platform

cano-wgMLST_BacCompare: A Bacterial Genome Analysis Platform for Epidemiological Investigation and Comparative Genomic Analysis

Frontiers in Microbiology ◽

10.3389/fmicb.2019.01687 ◽

2019 ◽

Vol 10 ◽

Cited By ~ 3

Author(s):

Yen-Yi Liu ◽

Ji-Wei Lin ◽

Chih-Chieh Chen

Keyword(s):

Genome Analysis ◽

Bacterial Genome ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Comparative Genomic ◽

Epidemiological Investigation ◽

Analysis Platform

Comparative Genomic Analysis of Fifty-Two Staphylococcus aureus Isolates Identified from Uncharacterized Staphylococcus Genomes in the NCBI database

10.1101/2020.06.18.159814 ◽

2020 ◽

Author(s):

Mohamed A. Abouelkhair

Keyword(s):

Staphylococcus Aureus ◽

Genome Sequence ◽

Food Poisoning ◽

Bacterial Pathogen ◽

Variant Calling ◽

Genomic Analysis ◽

Comparative Genomic ◽

Genome Sequences ◽

Spa Typing ◽

Genome Level

Abstract. Background: Staphylococcus aureus is a major bacterial pathogen that causes a variety of diseases, ranging from wound infections to severe bacteremia or food poisoning. The course and severity of the disease depend mainly on the bacterial genotype as well as host factors. Whole-genome sequencing (WGS), followed by bioinformatic sequence analysis, is currently the most extensive genotyping method available. Methods: A total of 253 uncharacterized Staphylococcus genome sequences were downloaded from the National Center for Biotechnology Information (NCBI) (August 2012 to March 2020) from different studies. Samples were clustered based on core and accessory pairwise distances between isolates and then analyzed with a multilocus sequence typing (MLST) tool. Staphylococcal Cassette Chromosome mec (SCCmec) typing, spa typing, variant calling, core genome alignment, and recombination site prediction were performed on the detected S. aureus isolates. S. aureus isolates were also analyzed for the presence of genes coding for virulence factors and antibiotic resistance. Results and conclusion: Uncategorized genome sequences were clustered into 24 groups. About 182 uncharacterized Staphylococcus genomes were identified at the species level based on MLST, including 32 S. lugdunensis genome sequences, thus doubling the number of publicly accessible S. lugdunensis genome sequences in GenBank. MLST identified another four species (S. epidermidis (33/253), S. lugdunensis (32/253), S. haemolyticus (41/253), S. hominis (24/253) and S. aureus (52/253)). Among the 52 S. aureus isolates, 21 (40.38%) carried the mecA gene, with 57.14% classified as SCCmec IV. The results of this study provide knowledge that facilitates evolutionary studies of staphylococcal species and other bacteria at the genome level.
