Roary: rapid large-scale prokaryote pan genome analysis

2015 ◽  
Vol 31 (22) ◽  
pp. 3691-3693 ◽  
Author(s):  
Andrew J. Page ◽  
Carla A. Cummins ◽  
Martin Hunt ◽  
Vanessa K. Wong ◽  
Sandra Reuter ◽  
...  
GigaScience ◽  
2018 ◽  
Vol 7 (4) ◽  
Author(s):  
Harry A Thorpe ◽  
Sion C Bayliss ◽  
Samuel K Sheppard ◽  
Edward J Feil

2015 ◽  
Author(s):  
Andrew J Page ◽  
Carla A Cummins ◽  
Martin Hunt ◽  
Vanessa K Wong ◽  
Sandra Reuter ◽  
...  

A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and dispensable accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU, Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors.
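
The core/accessory split named in the abstract can be illustrated with a minimal sketch. This is not Roary's implementation: it assumes gene clustering has already been done and only classifies the resulting clusters by how many isolates carry them. The function name `split_core_accessory`, the `core_percent` parameter, and the toy gene and isolate names are all hypothetical.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]],
    n_isolates: int,
    core_percent: int = 99,  # assumed cut-off, chosen for illustration
) -> Tuple[Set[str], Set[str]]:
    """Split gene clusters into core and accessory sets.

    `presence` maps a gene cluster name to the set of isolates carrying it.
    A cluster is core when at least `core_percent` % of isolates carry it.
    """
    core: Set[str] = set()
    accessory: Set[str] = set()
    for gene, isolates in presence.items():
        # Integer arithmetic avoids floating-point edge cases at the cut-off.
        if len(isolates) * 100 >= core_percent * n_isolates:
            core.add(gene)
        else:
            accessory.add(gene)
    return core, accessory


# Hypothetical toy data: four isolates, three gene clusters.
presence = {
    "dnaA": {"iso1", "iso2", "iso3", "iso4"},
    "gyrB": {"iso1", "iso2", "iso3", "iso4"},
    "blaTEM": {"iso2"},
}
core, accessory = split_core_accessory(presence, n_isolates=4)
print(sorted(core), sorted(accessory))
```

In this toy input the two clusters carried by every isolate fall into the core set and the single-isolate cluster falls into the accessory set.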


2022 ◽  
Author(s):  
Tang Li ◽  
Yanbin Yin

Background: Large-scale metagenome assembly and binning to generate metagenome-assembled genomes (MAGs) has become possible in the past five years. As a result, millions of MAGs have been produced and are increasingly included in pan-genomics workflows. However, pan-genome analyses of MAGs may suffer from the known issues with MAGs: fragmentation, incompleteness, and contamination, due to mis-assembly and mis-binning. Here, we conducted a critical assessment of including MAGs in pan-genome analysis by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. Results: We found that incompleteness led to more significant core gene loss than fragmentation. Contamination had little effect on core genome size but had a major influence on accessory genomes. The core gene loss remained when using different pan-genome analysis tools and when using a mixture of MAGs and complete genomes. Importantly, the core gene loss was partially alleviated by lowering the core gene threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The core gene loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Conclusions: We conclude that lowering the core gene threshold and predicting genes in metagenome mode (as Anvio does with Prodigal) are necessary in pan-genome analysis of MAGs to alleviate the accuracy loss. Better quality control of MAGs and development of new pan-genome analysis tools specifically designed for MAGs are needed in future studies.
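
The effect of the core gene threshold can be illustrated with a toy example. The sketch below is not the authors' pipeline or any particular tool's method: it assumes each genome is already reduced to a set of gene family names, and the function `core_genes`, the genome labels, and the gene labels `g1` to `g5` are hypothetical. One simulated incomplete MAG is missing a gene that every other genome carries.

```python
from typing import Dict, Set


def core_genes(genomes: Dict[str, Set[str]], threshold_percent: int) -> Set[str]:
    """Return genes present in at least `threshold_percent` % of genomes."""
    n = len(genomes)
    all_genes: Set[str] = set().union(*genomes.values())
    return {
        gene
        for gene in all_genes
        # Count how many genomes carry this gene; compare with integers only.
        if sum(gene in genes for genes in genomes.values()) * 100
        >= threshold_percent * n
    }


# Hypothetical toy data: nine complete genomes share five genes,
# and one simulated incomplete MAG has lost g5.
full = {"g1", "g2", "g3", "g4", "g5"}
genomes = {f"genome{i}": set(full) for i in range(9)}
genomes["mag1"] = full - {"g5"}

strict = core_genes(genomes, 100)   # g5 drops out of the core
relaxed = core_genes(genomes, 90)   # g5 is recovered
print(sorted(strict), sorted(relaxed))
```

With a strict 100% threshold, a single incomplete genome removes `g5` from the core set; relaxing the threshold to 90% recovers it in this toy input.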


2020 ◽  
Author(s):  
Idowu Olawoye ◽  
Simon D.W. Frost ◽  
Christian T. Happi

Background: Mycobacterium tuberculosis complex (MTBC) consists of seven major lineages, three of which are reported to circulate within West Africa: lineage 5 (West African 1) and lineage 6 (West African 2), which are geographically restricted to West Africa, and lineage 4 (Euro-American lineage), which is found globally. It is unclear why the West African lineages are not found elsewhere; hypotheses include an animal reservoir that is restricted to West Africa, a strain preference for hosts of West African ethnicity, or an inability to compete with other lineages in other locations. We tested the hypothesis that M. africanum West African 2 (lineage 6) might have emigrated out of West Africa but was outcompeted by more virulent modern strains of M. tuberculosis (MTB). Whole genome sequences of M. tuberculosis from Nigeria (n=21) and South Africa (n=24) and of M. africanum West African 2 from Mali (n=22) were retrieved, and a pan-genome analysis was performed after fully annotating these genomes. Results: The outcome of this analysis shows that Lineages 2, 4 and 6 all have a close pan-genome. We also see a correlation of the numbers of some multiple-copy core genes and of amino acid substitutions with lineage specificity that may have contributed to the geographical distribution of these lineages. Conclusions: The findings of this study provide a perspective on one of the hypotheses: that M. africanum West African 2 might find it difficult to compete against the more modern lineages outside West Africa, hence its localization to that geographical region.


2015 ◽  
Vol 32 (4) ◽  
pp. 497-504 ◽  
Author(s):  
Uwe Baier ◽  
Timo Beller ◽  
Enno Ohlebusch

2016 ◽  
Vol 8 (2) ◽  
pp. 387-402 ◽  
Author(s):  
Emilie Dumas ◽  
Eva Christina Boritsch ◽  
Mathias Vandenbogaert ◽  
Ricardo C. Rodríguez de la Vega ◽  
Jean-Michel Thiberge ◽  
...  

Genes ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 521 ◽  
Author(s):  
McCarthy ◽  
Fitzpatrick

Although the pan-genome concept originated in prokaryote genomics, an increasing number of eukaryote species pan-genomes have also been analysed. However, there is a relative lack of software intended for eukaryote pan-genome analysis compared to that available for prokaryotes. In a previous study, we analysed the pan-genomes of four model fungi with a computational pipeline that constructed pan-genomes using the synteny-dependent Pan-genome Ortholog Clustering Tool (PanOCT) approach. Here, we present a modified and improved version of that pipeline, which we have called Pangloss. Pangloss can perform gene prediction for a user-provided set of genomes from a given species, construct and optionally refine a species pan-genome from that set using PanOCT, and perform various functional characterisation and visualisation analyses of species pan-genome data. To demonstrate Pangloss’s capabilities, we constructed and analysed a species pan-genome for the oleaginous yeast Yarrowia lipolytica and also reconstructed a previously published species pan-genome for the opportunistic respiratory pathogen Aspergillus fumigatus. Pangloss is implemented in Python, Perl and R and is freely available under an open source GPLv3 licence via GitHub.


2020 ◽  
Vol 63 ◽  
pp. 54-62 ◽  
Author(s):  
Yeji Kim ◽  
Changdai Gu ◽  
Hyun Uk Kim ◽  
Sang Yup Lee
