Build a Bioinformatic Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics

2021 ◽  
Author(s):  
Hualin Liu ◽  
Bingyue Xin ◽  
Jinshui Zheng ◽  
Hao Zhong ◽  
Yun Yu ◽  
...  

Abstract Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, completing even a simple analysis requires using several tools or writing scripts, which is complicated for most biological researchers. To simplify the process, particularly for microbiologists, we have developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomic and comparative genomic analysis pipeline. PGCGAP implements genome assembly, gene prediction and annotation, genome and metagenome distance estimation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for non-bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics or http://bcam.hzau.edu.cn/linuxwgs.php).

2020 ◽  
Author(s):  
Hualin Liu ◽  
Bingyue Xin ◽  
Jinshui Zheng ◽  
Hao Zhong ◽  
Yun Yu ◽  
...  

Abstract Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even a simple analysis usually requires calling several tools or writing scripts, which is complicated for most biological researchers. To simplify the process, especially for microbiologists, we have developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for non-bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (http://bcam.hzau.edu.cn/linuxwgs.php).


Author(s):  
Hualin Liu ◽  
Bingyue Xin ◽  
Jinshui Zheng ◽  
Hao Zhong ◽  
Yun Yu ◽  
...  

Abstract Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even a simple analysis usually requires calling several tools or writing scripts, which is complicated for most biological researchers. To simplify the process, especially for microbiologists, we have developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for non-bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics or http://bcam.hzau.edu.cn/linuxwgs.php).


2020 ◽  
Author(s):  
Hualin Liu ◽  
Bingyue Xin ◽  
Jinshui Zheng ◽  
Hao Zhong ◽  
Yun Yu ◽  
...  

Abstract Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even a simple analysis usually requires calling several tools or writing scripts, which is complicated for most biological researchers. To simplify the process, especially for microbiologists, we have developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomics and comparative genomics analysis pipeline. It implements genome assembly, gene prediction and annotation, average nucleotide identity (ANI) calculation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for non-bioinformaticians to master. We therefore created a protocol to help microbiologists without any bioinformatics experience establish their own bioinformatics platform and perform routine analyses. The protocol shows how to choose equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics).


2021 ◽  
Vol 7 (5) ◽  
pp. 337
Author(s):  
Daniel Peterson ◽  
Tang Li ◽  
Ana M. Calvo ◽  
Yanbin Yin

Phytopathogenic Ascomycota are responsible for substantial economic losses each year, destroying valuable crops. The present study aims to provide new insights into phytopathogenicity in Ascomycota from a comparative genomic perspective. To this end, we categorized orthologous gene groups (orthogroups) from 68 phytopathogenic and 24 non-phytopathogenic Ascomycota genomes into three classes: core, (pathogen or non-pathogen) group-specific, and genome-specific accessory orthogroups. We found that (i) ~20% of orthogroups are group-specific and accessory in the 92 Ascomycota genomes, (ii) phytopathogenicity is not phylogenetically determined, (iii) group-specific orthogroups have more enriched functional terms than accessory orthogroups, a trend that is particularly evident in phytopathogenic fungi, (iv) secreted proteins with signal peptides and horizontal gene transfers (HGTs) are the two functional terms that show the highest occurrence and significance in group-specific orthogroups, and (v) a number of other functional terms also have higher significance and occurrence in group-specific orthogroups. Overall, our comparative genomics analysis identified positive enrichment between orthogroup classes and suggested which genomic characteristics make an Ascomycete phytopathogenic. We conclude that genes shared by multiple phytopathogenic genomes are more important for phytopathogenicity than those that are unique to each genome.


2009 ◽  
Vol 76 (2) ◽  
pp. 589-595 ◽  
Author(s):  
Yanlin Zhao ◽  
Kui Wang ◽  
Hans-Wolfgang Ackermann ◽  
Rolf U. Halden ◽  
Nianzhi Jiao ◽  
...  

ABSTRACT Prophages are common in many bacterial genomes. Distinguishing putatively viable prophages from nonviable sequences can be a challenge, since some prophages are remnants of once-functional prophages that have been rendered inactive by mutational changes. In some cases, a putative prophage may be missed due to the lack of recognizable prophage loci. The genome of a marine roseobacter, Roseovarius nubinhibens ISM (hereinafter referred to as ISM), was recently sequenced and was reported to contain no intact prophage based on customary bioinformatic analysis. However, prophage induction experiments performed with this organism led to a different conclusion. In the laboratory, virus-like particles in the ISM culture increased more than 3 orders of magnitude following induction with mitomycin C. After careful examination of the ISM genome sequence, a putative prophage (ISM-pro1) was identified. Although this prophage contains only minimal phage-like genes, we demonstrated that this “hidden” prophage is inducible. Genomic analysis and reannotation showed that most of the ISM-pro1 open reading frames (ORFs) display the highest sequence similarity with Rhodobacterales bacterial genes and some ORFs are only distantly related to genes of other known phages or prophages. Comparative genomic analyses indicated that ISM-pro1-like prophages or prophage remnants are also present in other Rhodobacterales genomes. In addition, the lysis of ISM by this previously unrecognized prophage appeared to increase the production of gene transfer agents (GTAs). Our study suggests that a combination of in silico genomic analyses and experimental laboratory work is needed to fully understand the lysogenic features of a given bacterium.


2020 ◽  
Vol 11 ◽  
Author(s):  
Sean A. Buono ◽  
Reagan J. Kelly ◽  
Nadav Topaz ◽  
Adam C. Retchless ◽  
Hideky Silva ◽  
...  

Effective laboratory-based surveillance and public health response to bacterial meningitis depends on timely characterization of bacterial meningitis pathogens. Traditionally, characterizing bacterial meningitis pathogens such as Neisseria meningitidis (Nm) and Haemophilus influenzae (Hi) required several biochemical and molecular tests. Whole genome sequencing (WGS) has enabled the development of pipelines capable of characterizing the given pathogen with equivalent results to many of the traditional tests. Here, we present the Bacterial Meningitis Genomic Analysis Platform (BMGAP): a secure, web-accessible informatics platform that facilitates automated analysis of WGS data in public health laboratories. BMGAP is a pipeline comprised of several components, including both widely used, open-source third-party software and customized analysis modules for the specific target pathogens. BMGAP performs de novo draft genome assembly and identifies the bacterial species by whole-genome comparisons against a curated reference collection of 17 focal species including Nm, Hi, and other closely related species. Genomes identified as Nm or Hi undergo multi-locus sequence typing (MLST) and capsule characterization. Further typing information is captured from Nm genomes, such as peptides for the vaccine antigens FHbp, NadA, and NhbA. Assembled genomes are retained in the BMGAP database, serving as a repository for genomic comparisons. BMGAP’s species identification and capsule characterization modules were validated using PCR and slide agglutination from 446 bacterial invasive isolates (273 Nm from nine different serogroups, 150 Hi from seven different serotypes, and 23 from nine other species) collected from 2017 to 2019 through surveillance programs. 
Among the validation isolates, BMGAP correctly identified the species for all 446 isolates (100% sensitivity and specificity) and accurately characterized all Nm serogroups (99% sensitivity and 98% specificity) and Hi serotypes (100% sensitivity and specificity). BMGAP provides an automated, multi-species analysis pipeline that can be extended to include additional analysis modules as needed. This provides easy-to-interpret and validated Nm and Hi genome analysis capacity to public health laboratories and collaborators. As the BMGAP database accumulates more genomic data, it grows as a valuable resource for rapid comparative genomic analyses during outbreak investigations.


Database ◽  
2014 ◽  
Vol 2014 (0) ◽  
pp. bau082-bau082 ◽  
Author(s):  
M. Y. Ang ◽  
H. Heydari ◽  
N. S. Jakubovics ◽  
M. I. Mahmud ◽  
A. Dutta ◽  
...  

2020 ◽  
Author(s):  
Mohamed A. Abouelkhair

Abstract Background: Staphylococcus aureus is a major bacterial pathogen that causes a variety of diseases, ranging from wound infections to severe bacteremia or food poisoning. The course and severity of the disease are mainly dependent on the bacterium genotype as well as host factors. Whole-genome sequencing (WGS) is currently the most extensive genotyping method available, followed by bioinformatic sequence analysis. Methods: A total of 253 uncharacterized Staphylococcus genome sequences from different studies (August 2012 to March 2020) were downloaded from the National Center for Biotechnology Information (NCBI). Samples were clustered based on core and accessory pairwise distances between isolates and then analyzed with a multilocus sequence typing (MLST) tool. Staphylococcal Cassette Chromosome mec (SCCmec) typing, spa typing, variant calling, core genome alignment, and recombination site prediction were performed on detected S. aureus isolates. S. aureus isolates were also analyzed for the presence of genes coding for virulence factors and antibiotic resistance. Results and conclusion: Uncategorized genome sequences were clustered into 24 groups. About 182 uncharacterized Staphylococcus genomes were identified at the species level based on MLST, including 32 S. lugdunensis genome sequences, thus doubling the number of publicly accessible S. lugdunensis genome sequences in GenBank. MLST identified another four species (S. epidermidis (33/253), S. lugdunensis (32/253), S. haemolyticus (41/253), S. hominis (24/253) and S. aureus (52/253)). Among the 52 S. aureus isolates, 21 (40.38%) carried the mecA gene, with 57.14% classified as SCCmec IV. The results of this study provide knowledge that facilitates evolutionary studies of staphylococcal species and other bacteria at the genome level.

