rPinecone: Define sub-lineages of a clonal expansion via a phylogenetic tree

2018 ◽  
Author(s):  
Alexander M. Wailan ◽  
Francesc Coll ◽  
Eva Heinz ◽  
Gerry Tonkin-Hill ◽  
Jukka Corander ◽  
...  

ABSTRACT

The ability to distinguish between pathogens is a fundamental requirement to understand the epidemiology of infectious diseases. Phylogenetic analysis of genomic data can provide a powerful platform to identify lineages within bacterial populations, and thus inform outbreak investigation and transmission dynamics. However, resolving differences between pathogens associated with low variant (LV) populations carrying low median pairwise single nucleotide variant (SNV) distances remains a major challenge. Here we present rPinecone, an R package designed to define sub-lineages within closely related LV populations. rPinecone uses a root-to-tip directional approach to define sub-lineages within a phylogenetic tree according to SNV distance from the ancestral node. The utility of this program was demonstrated using genomic data of two LV populations: a hospital outbreak of methicillin-resistant Staphylococcus aureus and endemic Salmonella Typhi from rural Cambodia. rPinecone identified the transmission branches of the hospital outbreak and geographically-confined lineages in Cambodia. Sub-lineages identified by rPinecone in both analyses were phylogenetically robust. It is anticipated that rPinecone can be used to discriminate between lineages of bacteria from LV populations where other methods fail, enabling a deeper understanding of infectious disease epidemiology for public health purposes.

DATA SUMMARY

Source code for rPinecone is available on GitHub under the open source licence GNU GPL 3 (https://github.com/alexwailan/rpinecone).
Newick format files for both phylogenetic trees have been deposited in Figshare (https://doi.org/10.6084/m9.figshare.7022558).
Geographical analysis of the S. Typhi dataset using Microreact is available at https://microreact.org/project/r1IqkrN1X.
Accession numbers, metadata and sample lineage results of both datasets used in this paper are listed in the supplementary tables.
I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ⊠

IMPACT STATEMENT

Whole genome sequence data from bacterial pathogens is increasingly used in the epidemiological investigation of infectious disease, both in outbreak and endemic situations. However, distinguishing bacterial species that are very similar and likely to come from a small geographical and temporal range presents a major technical challenge for epidemiologists. rPinecone was designed to address this challenge and utilises phylogenetic data to define lineages within bacterial populations that have limited variation. This approach is therefore of great interest to epidemiologists, as it adds a further level of clarity beyond that offered by existing approaches, which were not designed to consider bacterial isolates containing variation that exists only transiently but is epidemiologically informative. rPinecone has the flexibility to be applied to multiple pathogens and has direct application for investigations of clinical outbreaks and endemic disease to understand transmission dynamics or geographical hotspots of disease.
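
The root-to-tip idea described in the rPinecone abstract can be illustrated with a small sketch. This is not rPinecone's actual algorithm or API (rPinecone is an R package). It is a simplified Python illustration that assumes a hypothetical toy tree stored as a dictionary with branch lengths in SNVs, and a hypothetical rule that a new sub-lineage opens whenever the SNV distance accumulated since the current sub-lineage's ancestral node exceeds a threshold.

```python
from typing import Dict, List, Tuple

# Hypothetical tree encoding: parent -> list of (child, branch length in SNVs).
Tree = Dict[str, List[Tuple[str, int]]]

def assign_sublineages(tree: Tree, root: str, threshold: int) -> Dict[str, int]:
    """Label each tip with a sub-lineage number by walking from root to tips.

    Assumed rule (illustrative only): when the SNV distance accumulated since
    the ancestral node of the current sub-lineage exceeds `threshold`, the
    child node becomes the ancestral node of a new sub-lineage.
    """
    labels: Dict[str, int] = {}
    next_id = 1  # highest sub-lineage number handed out so far

    def walk(node: str, dist: int, lineage: int) -> None:
        nonlocal next_id
        children = tree.get(node, [])
        if not children:           # a tip: record its sub-lineage
            labels[node] = lineage
            return
        for child, snvs in children:
            if dist + snvs > threshold:
                next_id += 1       # open a new sub-lineage at this child
                walk(child, 0, next_id)
            else:
                walk(child, dist + snvs, lineage)

    walk(root, 0, 1)
    return labels

# Toy example: tips A and B stay close to the root, C and D sit behind a long branch.
toy_tree: Tree = {
    "root": [("n1", 1), ("n2", 6)],
    "n1": [("A", 0), ("B", 1)],
    "n2": [("C", 1), ("D", 0)],
}
toy_labels = assign_sublineages(toy_tree, "root", threshold=2)
```

In this toy case the long branch leading to `n2` separates tips C and D from A and B. The threshold value is an arbitrary choice for the example, not one taken from the paper.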


2017 ◽  
Author(s):  
Mickael Silva ◽  
Miguel Machado ◽  
Diogo N. Silva ◽  
Mirko Rossi ◽  
Jacob Moran-Gilad ◽  
...  

ABSTRACT

Gene-by-gene approaches are becoming increasingly popular in bacterial genomic epidemiology and outbreak detection. However, there is a lack of open-source scalable software for schema definition and allele calling for these methodologies. The chewBBACA suite was designed to assist users in the creation and evaluation of novel whole-genome or core-genome gene-by-gene typing schemas and subsequent allele calling in bacterial strains of interest. The software can run on a laptop or on high performance clusters, making it useful for both small laboratories and large reference centers. chewBBACA is available at https://github.com/B-UMMI/chewBBACA or as a docker image at https://hub.docker.com/r/ummidock/chewbbaca/.

DATA SUMMARY

Assembled genomes used for the tutorial were downloaded from NCBI in August 2016 by selecting those submitted as Streptococcus agalactiae taxon or sub-taxa. All the assemblies have been deposited as a zip file in FigShare (https://figshare.com/s/9cbe1d422805db54cd52), where a file with the original ftp link for each NCBI directory is also available.
Code for the chewBBACA suite is available at https://github.com/B-UMMI/chewBBACA while the tutorial example is found at https://github.com/B-UMMI/chewBBACA_tutorial.
I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ⊠

IMPACT STATEMENT

The chewBBACA software offers a computational solution for the creation, evaluation and use of whole genome (wg) and core genome (cg) multilocus sequence typing (MLST) schemas. It allows researchers to develop wg/cgMLST schemes for any bacterial species from a set of genomes of interest. The alleles identified by chewBBACA correspond to potential coding sequences, possibly offering insights into the correspondence between the genetic variability identified and phenotypic variability. The software performs allele calling in a matter of seconds to minutes per strain on a laptop but is easily scalable for the analysis of large datasets of hundreds of thousands of strains using multiprocessing options. The chewBBACA software thus provides an efficient and freely available open source solution for gene-by-gene methods. Moreover, the ability to perform these tasks locally is desirable when the submission of raw data to a central repository or web services is hindered by data protection policies or ethical or legal concerns.



2016 ◽  
Author(s):  
Gregorio Iraola ◽  
Hugo Naya

Taxonomy of prokaryotes has remained a controversial discipline due to the extreme plasticity of microorganisms, causing inconsistencies between phenotypic and genotypic classifications. The genomics era has enhanced taxonomy but also opened new debates about the best practices for incorporating genomic data into polyphasic taxonomy protocols, which are fairly biased towards the identification of bacterial species. Here we use an extensive dataset of Archaea and Bacteria to prove that metabolic signatures coded in their genomes are informative traits that allow organisms to be classified accurately and coherently at higher taxonomic ranks, and that allow functional features to be associated with the definition of taxa. Our results support the ecological coherence of higher taxonomic ranks and reconcile taxonomy with traditional chemotaxonomic traits inferred from genomes. KARL, a simple and free tool for assisting polyphasic taxonomy or performing functional prospections, is also presented (https://github.com/giraola/KARL).



2017 ◽  
Author(s):  
Lennard Epping ◽  
Andries J. van Tonder ◽  
Rebecca A. Gladstone ◽  
Stephen D. Bentley ◽  
Andrew J. Page ◽  
...  

ABSTRACT

Streptococcus pneumoniae is responsible for 240,000–460,000 deaths in children under 5 years of age each year. Accurate identification of pneumococcal serotypes is important for tracking the distribution and evolution of serotypes following the introduction of effective vaccines. Recent efforts have been made to infer serotypes directly from genomic data but current software approaches are limited and do not scale well. Here, we introduce a novel method, SeroBA, which uses a hybrid assembly and mapping approach. We compared SeroBA against real and simulated data and present results on the concordance and computational performance against a validation dataset, the robustness and scalability when analysing a large dataset, and the impact of varying the depth of coverage in the cps locus region on sequence-based serotyping. SeroBA can predict serotypes, by identifying the cps locus, directly from raw whole genome sequencing read data with 98% concordance using a k-mer based method, can process 10,000 samples in just over 1 day using a standard server and can call serotypes at a coverage as low as 10x. SeroBA is implemented in Python3 and is freely available under an open source GPLv3 license from: https://github.com/sanger-pathogens/seroba.

DATA SUMMARY

The reference genome Streptococcus pneumoniae ATCC 700669 is available from the National Center for Biotechnology Information (NCBI) with the accession number: FM211187
Simulated paired end reads for experiment 2 have been deposited in FigShare: https://doi.org/10.6084/m9.figshare.5086054.v1
Accession numbers for all other experiments are listed in Supplementary Table S1 and Supplementary Table S2.
I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ⊠

IMPACT STATEMENT

This article describes SeroBA, a k-mer based method for predicting the serotypes of Streptococcus pneumoniae from Whole Genome Sequencing (WGS) data. SeroBA can identify 92 serotypes and 2 subtypes with constant memory usage and low computational costs. We showed that SeroBA is able to reliably predict serotypes at a depth of coverage as low as 10x and is scalable to large datasets.
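
To make the phrase "k-mer based method" concrete, the following is a minimal sketch of one generic way k-mers can be used to pick the reference sequence best supported by a set of reads. It is not SeroBA's implementation. The function names, the toy sequences and the scoring rule (fraction of reference k-mers observed in the reads) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_coverage(reads: Iterable[str], reference: str, k: int) -> float:
    """Fraction of the reference's distinct k-mers seen in at least one read."""
    ref_kmers = kmers(reference, k)
    if not ref_kmers:
        return 0.0
    seen: Set[str] = set()
    for read in reads:
        seen |= kmers(read, k) & ref_kmers
    return len(seen) / len(ref_kmers)

def best_reference(reads: Iterable[str], references: Dict[str, str],
                   k: int = 5) -> Tuple[str, Dict[str, float]]:
    """Score every candidate reference and return the best-covered one."""
    reads = list(reads)  # allow the reads to be scanned once per reference
    scores = {name: kmer_coverage(reads, seq, k) for name, seq in references.items()}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical sequences, far shorter than a real cps locus).
toy_refs = {"typeX": "ACGTACGGTCA", "typeY": "TTGACCATGCA"}
toy_reads = ["ACGTACGG", "TACGGTCA"]
best, scores = best_reference(toy_reads, toy_refs, k=5)
```

Real read data would also need handling of reverse complements, sequencing errors and depth of coverage, none of which this sketch attempts.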



PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e2997 ◽  
Author(s):  
Markus Zojer ◽  
Lisa N. Schuster ◽  
Frederik Schulz ◽  
Alexander Pfundner ◽  
Matthias Horn ◽  
...  

Genomic heterogeneity of bacterial species is observed and studied in experimental evolution experiments and clinical diagnostics, and occurs as micro-diversity of natural habitats. The challenge for genome research is to accurately capture this heterogeneity with the currently used short sequencing reads. Recent advances in NGS technologies improved the speed and coverage and thus allowed for deep sequencing of bacterial populations. This facilitates the quantitative assessment of genomic heterogeneity, including low frequency alleles or haplotypes. However, false positive variant predictions due to sequencing errors and mapping artifacts of short reads need to be prevented. We therefore created VarCap, a workflow for the reliable prediction of different types of variants even at low frequencies. In order to predict SNPs, InDels and structural variations, we evaluated the sensitivity and accuracy of different software tools using synthetic read data. The results suggested that the best sensitivity could be reached by a union of different tools, however at the price of increased false positives. We identified possible reasons for false predictions and used this knowledge to improve the accuracy by post-filtering the predicted variants according to properties such as frequency, coverage, genomic environment/localization and co-localization with other variants. We observed that best precision was achieved by using an intersection of at least two tools per variant. This resulted in the reliable prediction of variants above a minimum relative abundance of 2%. VarCap is designed for being routinely used within experimental evolution experiments or for clinical diagnostics. The detected variants are reported as frequencies within a VCF file and as a graphical overview of the distribution of the different variant/allele/haplotype frequencies. The source code of VarCap is available at https://github.com/ma2o/VarCap. 
In order to provide this workflow to a broad community, we implemented VarCap on a Galaxy webserver, which is accessible at http://galaxy.csb.univie.ac.at.
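
The two filtering ideas mentioned in the VarCap abstract, requiring support from at least two tools per variant and a minimum relative abundance of 2%, can be sketched as follows. This is an illustrative sketch only, not VarCap's code. The data layout (a dictionary of per-tool calls keyed by a variant tuple), the tool names and the use of the mean frequency across supporting tools are assumptions.

```python
from collections import defaultdict
from typing import Dict, Tuple

Variant = Tuple[str, int, str, str]  # (contig, position, ref allele, alt allele)

def consensus_variants(calls_by_tool: Dict[str, Dict[Variant, float]],
                       min_tools: int = 2,
                       min_freq: float = 0.02) -> Dict[Variant, float]:
    """Keep variants reported by >= min_tools tools with mean frequency >= min_freq."""
    support: Dict[Variant, Dict[str, float]] = defaultdict(dict)
    for tool, calls in calls_by_tool.items():
        for variant, freq in calls.items():
            support[variant][tool] = freq

    kept: Dict[Variant, float] = {}
    for variant, per_tool in support.items():
        mean_freq = sum(per_tool.values()) / len(per_tool)
        if len(per_tool) >= min_tools and mean_freq >= min_freq:
            kept[variant] = mean_freq
    return kept

# Hypothetical calls from three unnamed tools.
toy_calls = {
    "toolA": {("chr", 100, "A", "G"): 0.05, ("chr", 200, "C", "T"): 0.01},
    "toolB": {("chr", 100, "A", "G"): 0.03, ("chr", 300, "G", "A"): 0.10},
    "toolC": {("chr", 200, "C", "T"): 0.01},
}
kept = consensus_variants(toy_calls)
```

Here the variant at position 100 is kept, the one at position 200 is dropped for low frequency despite two supporting tools, and the one at position 300 is dropped because only one tool reports it.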



Microbiology ◽  
2020 ◽  
Vol 166 (10) ◽  
pp. 995-1003 ◽  
Author(s):  
Laura M. Nolan ◽  
Lynne Turnbull ◽  
Marilyn Katrib ◽  
Sarah R. Osvath ◽  
Davide Losa ◽  
...  

Natural transformation is a mechanism that enables competent bacteria to acquire naked, exogenous DNA from the environment. It is a key process that facilitates the dissemination of antibiotic resistance and virulence determinants throughout bacterial populations. Pseudomonas aeruginosa is an opportunistic Gram-negative pathogen that produces large quantities of extracellular DNA (eDNA) that is required for biofilm formation. P. aeruginosa has a remarkable level of genome plasticity and diversity that suggests a high degree of horizontal gene transfer and recombination but is thought to be incapable of natural transformation. Here we show that P. aeruginosa possesses homologues of all proteins known to be involved in natural transformation in other bacterial species. We found that P. aeruginosa in biofilms is competent for natural transformation of both genomic and plasmid DNA. Furthermore, we demonstrate that type-IV pili (T4P) facilitate but are not absolutely essential for natural transformation in P. aeruginosa.



2020 ◽  
Vol 70 (3) ◽  
pp. 1738-1750 ◽  
Author(s):  
Awa Diop ◽  
Khalid El Karkouri ◽  
Didier Raoult ◽  
Pierre-Edouard Fournier

Over recent years, genomic information has increasingly been used for prokaryotic species definition and classification. Genome sequence-based alternatives to the gold standard DNA–DNA hybridization (DDH) relatedness have been developed, notably average nucleotide identity (ANI), which is one of the most useful measurements for species delineation in the genomic era. However, the strictly intracellular lifestyle, the few measurable phenotypic properties and the low level of genetic heterogeneity make the current standard genomic criteria for bacterial species definition inapplicable to Rickettsia species. We evaluated a range of whole genome sequence (WGS)-based taxonomic parameters to develop guidelines for the classification of Rickettsia isolates at genus and species levels. By comparing the degree of similarity of 74 WGSs from 31 Rickettsia species and 61 WGSs from members of three closely related genera also belonging to the order Rickettsiales (Orientia, 11 genomes; Ehrlichia, 22 genomes; and Anaplasma, 28 genomes) using digital DDH (dDDH) and ANI by orthology (OrthoANI) parameters, we demonstrated that WGS-based taxonomic information, which is easy to obtain and use, can serve for reliable classification of isolates within the Rickettsia genus and species. To be classified as a member of the genus Rickettsia, a bacterial isolate should exhibit OrthoANI values with any Rickettsia species with a validly published name of ≥83.63 %. To be classified as a new Rickettsia species, an isolate should not exceed either of the following levels of genomic relatedness with the most closely related species: 92.30 % for dDDH and 99.19 % for OrthoANI. When applied to four rickettsial isolates of uncertain status, the above-described thresholds enabled their classification as new species in one case. Thus, we propose WGS-based guidelines to efficiently delineate Rickettsia species, with OrthoANI and dDDH being the most accurate for classification at the genus and species levels, respectively.
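
The thresholds quoted in the Rickettsia abstract can be written as a small decision rule. The sketch below is one reading of the abstract, not the authors' code. It assumes that the genus check uses the OrthoANI value against the closest validly named Rickettsia species, and that an isolate is a candidate new species only when neither the dDDH nor the OrthoANI value against that species exceeds its threshold. The function name and the returned labels are hypothetical.

```python
GENUS_MIN_ORTHOANI = 83.63    # % OrthoANI required for genus membership
SPECIES_MAX_DDDH = 92.30      # % dDDH above which the isolate matches an existing species
SPECIES_MAX_ORTHOANI = 99.19  # % OrthoANI above which the isolate matches an existing species

def classify_isolate(orthoani_closest: float, dddh_closest: float) -> str:
    """Classify an isolate from its similarity to the closest Rickettsia species.

    Both arguments are percentages measured against the most closely
    related Rickettsia species with a validly published name (assumption).
    """
    if orthoani_closest < GENUS_MIN_ORTHOANI:
        return "outside genus Rickettsia"
    if dddh_closest > SPECIES_MAX_DDDH or orthoani_closest > SPECIES_MAX_ORTHOANI:
        return "existing Rickettsia species"
    return "candidate new Rickettsia species"

# Hypothetical isolates (values invented for illustration).
examples = {
    "isolate_1": classify_isolate(orthoani_closest=80.0, dddh_closest=20.0),
    "isolate_2": classify_isolate(orthoani_closest=99.5, dddh_closest=95.0),
    "isolate_3": classify_isolate(orthoani_closest=95.0, dddh_closest=60.0),
}
```

Values exactly equal to a species threshold are treated as not exceeding it, following the ">" signs in the abstract.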



2021 ◽  
Author(s):  
Nicholas D. Youngblut ◽  
Ruth E. Ley

Abstract

Mapping metagenome reads to reference databases is the standard approach for assessing microbial taxonomic and functional diversity from metagenomic data. However, public reference databases often lack recently generated genomic data such as metagenome-assembled genomes (MAGs), which can limit the sensitivity of read-mapping approaches. We previously developed the Struo pipeline in order to provide a straight-forward method for constructing custom databases; however, the pipeline does not scale well with the ever-increasing number of publicly available microbial genomes. Moreover, the pipeline does not allow for efficient database updating as new data are generated. To address these issues, we developed Struo2, which is >3.5-fold faster than Struo at database generation and can also efficiently update existing databases. We also provide custom Kraken2, Bracken, and HUMAnN3 databases that can be easily updated with new genomes and/or individual gene sequences. Struo2 enables feasible database generation for continually increasing large-scale genomic datasets.

Availability

Struo2: https://github.com/leylabmpi/Struo2
Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/
Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump



Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ubiratan Cardinalli Adler ◽  
Maristela Schiabel Adler ◽  
Livia Mitchiguian Hotta ◽  
Ana Elisa Madureira Padula ◽  
Amarilys de Toledo Cesar ◽  
...  

Abstract

Objectives
To investigate the effectiveness and safety of homeopathic medicine Natrum muriaticum (LM2) for mild cases of COVID-19 in Primary Health Care.

Trial design
A randomized, two-armed (1:1), parallel, placebo-controlled, double-blind, clinical trial is being performed to test the following hypotheses: H0: homeopathic medicines = placebo (null hypothesis) vs. H1: homeopathic medicines ≠ placebo (alternative hypothesis) for mild cases of COVID-19 in Primary Care.

Participants
Setting: Primary Care of São Carlos – São Paulo – Brazil. One hundred participants aged 18 years or older, with influenza-like symptoms and a positive RT-PCR for SARS-CoV-2. Willingness to give informed consent and to comply with the study procedures is also required. Exclusion criterion: severe acute respiratory syndrome.

Intervention and comparator
Homeopathy: 1 globule of Natrum muriaticum LM2 diluted in 20 mL of alcohol 30% and dispensed in a 30 mL bottle. Placebo: 20 mL of alcohol 30% dispensed in a 30 mL bottle. Posology: one drop taken orally every 4 hours (6 doses/day) while there is fever, cough, tiredness, or pain (headache, sore throat, muscle aches, chest pain, etc.) followed by one drop every 6 hours (4 doses/day) until the fourteenth day of use. The bottle of study medication should be submitted to 10 vigorous shakes (succussions) before each dose. Posology may be changed by telemedicine, with no break in blinding. Study medication should be maintained during home isolation. According to the Primary Care protocol, the home isolation period lasts until the 10th day after the appearance of the first symptom, or up to 72 hours without symptoms.

Main outcomes
The primary endpoint will be time to recovery, defined as the number of days elapsed before all COVID-19 influenza-like symptoms are recorded as mild or absent during the home isolation period. Secondary measures are recovery time for each COVID-19 symptom; score of the scale created for the study (COVID-Simile Scale); medicines used during follow-up; number of days of follow-up; number of visits to emergency services; number of hospitalizations; other symptoms and adverse events during the home isolation period.

Randomisation
The study statistician generated a block randomization list, using a 1:1 ratio of the two groups (denoted as A and B) and a web-based tool (http://www.random.org/lists).

Blinding (masking)
The clinical investigators, the statistician, the Primary Care teams, the study collaborators, and the participants will remain blinded to the identity of the two treatment groups until the end of the study.

Numbers to be randomised (sample size)
One hundred participants are planned to be randomized (1:1) to placebo (50) or homeopathy (50).

Trial Status
Protocol version/date May 21, 2020. Recruitment is ongoing. First participant was recruited/included on June 29, 2020. Due to recruitment adaptations to Primary Care changes, the authors anticipate the trial will finish recruiting on April 10, 2021.

Trial registration
COVID-Simile Study was registered at the University Hospital Medical Information Network (UMIN - https://www.umin.ac.jp/ctr/index.htm) on June 1st, 2020, and the trial start date was June 15, 2020. Unique ID: UMIN000040602.

Full protocol
The full protocol is attached as an additional file, accessible from the Trials website (Additional file 1). In the interest of expediting dissemination of this material, the familiar formatting has been eliminated; this Letter serves as a summary of the key elements of the full protocol.
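
The block randomization described in the protocol (two groups, A and B, in a 1:1 ratio, 100 participants) can be sketched in code. The study itself used the web-based tool at random.org, so this is not the authors' procedure. The block size of 4, the function name and the use of Python's pseudo-random generator are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence

def block_randomization(n: int, block_size: int = 4,
                        groups: Sequence[str] = ("A", "B"),
                        seed: Optional[int] = None) -> List[str]:
    """Return an allocation list of length n built from shuffled balanced blocks."""
    if block_size % len(groups) != 0:
        raise ValueError("block_size must be a multiple of the number of groups")
    rng = random.Random(seed)  # private generator so a seed gives a reproducible list
    allocation: List[str] = []
    while len(allocation) < n:
        # Each block holds every group equally often, then is shuffled.
        block = list(groups) * (block_size // len(groups))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n]

# 100 participants in blocks of 4 gives exactly 50 per group.
allocation = block_randomization(100, block_size=4, seed=2020)
```

Because every complete block is balanced, the two arms never differ by more than half a block at any point during recruitment, which is the usual reason for choosing block over simple randomization.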



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Merete Nørgaard Madsen ◽  
Maria Lange Kirkegaard ◽  
Thomas Martin Klebe ◽  
Charlotte Lorenzen Linnebjerg ◽  
Søren Martin Riis Villumsen ◽  
...  

Abstract

Background
Extended scope physiotherapists (ESP) are increasingly supplementing orthopaedic surgeons (OS) in diagnosing patients with musculoskeletal disorders. Studies have reported satisfactory diagnostic and treatment agreement between ESPs and OSs, but methodological study quality is generally low, and only a few studies have evaluated inter-professional collaboration. Our aims were: 1) to evaluate agreement on diagnosis and treatment plan between ESPs and OSs examining patients with shoulder disorders, 2) to explore and evaluate their inter-professional collaboration.

Methods
In an orthopaedic outpatient shoulder clinic, 69 patients were examined independently twice on the same day by an ESP and an OS in random order. Primary and secondary diagnoses (nine categories) and treatment plan (five categories, combinations allowed) were registered by each professional and compared. Percentage of agreement and kappa-values were calculated. Two semi-structured focus-group interviews were performed with ESPs and OSs, respectively. Interviews were based on the theoretical concept of Relational Coordination, encompassing seven dimensions of communication and relationship among professionals. A thematic analysis was conducted.

Results
Agreement on primary diagnosis was 62% (95% CI: [50; 73]). ESPs and OSs agreed on the combination of diagnoses in 79% (95% CI: [70; 89]) of the cases. Partial diagnostic agreement (one professional’s primary diagnosis was also registered as either primary or secondary diagnosis by the other) was 96% (95% CI: [91; 100]). Across treatment categories, agreement varied between 68% (95% CI: [57; 79]) and 100%. In 43% (95% CI: [31; 54]) of the cases, ESP and OS had full concordance between treatment categories chosen, while they agreed on at least one recommendation in 96% (95% CI: [91; 100]). Positive statements of all dimensions of relational coordination were found. Three themes especially important in the inter-professional collaboration emerged: Close communication, equal and respectful relationship and professional skills.

Conclusions
In the majority of cases, the ESP and OS registered the same or partly the same diagnosis and treatment plan. Indications of a high relational coordination implying a good inter-professional collaboration were found. Our results support that ESPs and OSs can share the task of examining selected patients with shoulder disorders in an orthopaedic clinic.

Trial registration
ClinicalTrials.gov Identifier: NCT03343951. Registered 10 November 2017
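
The two agreement statistics named in the Methods, percentage of agreement and kappa, can be computed as in the sketch below. The abstract does not say which kappa variant was used, so the sketch assumes unweighted Cohen's kappa for two raters. The diagnosis labels and the function name are hypothetical and the toy ratings are not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple

def agreement_and_kappa(rater1: Sequence[Hashable],
                        rater2: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (observed agreement, unweighted Cohen's kappa) for two raters."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater1)
    # Observed agreement: share of cases where both raters chose the same category.
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Expected chance agreement from each rater's marginal category frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_expected = sum(c1[cat] * c2[cat] for cat in set(c1) | set(c2)) / (n * n)
    if p_expected == 1.0:   # both raters used one identical category throughout
        return p_observed, 1.0
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa

# Toy ratings for four patients (hypothetical categories).
esp = ["cuff", "cuff", "instability", "instability"]
os_ = ["cuff", "instability", "instability", "instability"]
p_obs, kappa = agreement_and_kappa(esp, os_)
```

In the toy data the raters agree on three of four cases, and kappa is lower than the raw agreement because part of that agreement is expected by chance.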



2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A372-A373
Author(s):  
Ira Winer ◽  
Lucy Gilbert ◽  
Ulka Vaishampayan ◽  
Seth Rosen ◽  
Christopher Hoimes ◽  
...  

Background
ALKS 4230 is a novel engineered cytokine that selectively targets the intermediate-affinity interleukin-2 receptor complex to activate CD8+ T cells and natural killer cells [1]. The ARTISTRY-1 trial (NCT02799095) has shown encouraging efficacy and acceptable tolerability of ALKS 4230 among patients with advanced solid tumors [2]. We report a detailed analysis of ovarian cancer (OC) patients who received combination therapy in ARTISTRY-1.

Methods
ARTISTRY-1 is an ongoing multicohort phase 1/2 trial exploring intravenous ALKS 4230 as monotherapy and combined with pembrolizumab. OC patients were enrolled into a cohort with mixed anti-PD-1/L1 unapproved tumor types who had progressed on prior chemotherapy. OC patients received ALKS 4230 (3 µg/kg) on days 1–5 and pembrolizumab (200 mg) on day 1 of a 21-day cycle. Outcomes presented include antitumor activity (RECIST v1.1) and safety as of 7/24/2020. To evaluate changes in tumor microenvironment (TME), baseline and on-treatment biopsies were collected.

Results
Fourteen heavily pretreated patients with OC were enrolled. Patients received a median of 5 (range, 2–11) prior regimens and all were previously treated with platinum-based therapy. Among 13 evaluable patients with ≥1 assessment, 9 experienced disease control and 4 experienced disease progression; median treatment duration was approximately 7 weeks. Three patients experienced an objective response, including 1 complete response, 1 partial response (PR), and 1 unconfirmed PR; all were platinum-resistant and negative for BRCA mutations. Five patients experienced tumor burden reductions (table 1). Treatment-related adverse events at the doses tested have generally been transient and manageable, with the majority being grade 1 and 2 in severity. Overall, based on preliminary data, the combination with ALKS 4230 did not demonstrate any additive toxicity to that already established with pembrolizumab alone. Additional safety and efficacy data are being collected in ongoing cohorts. In the monotherapy dose escalation portion of the study, ALKS 4230 alone increased markers of lymphocyte infiltration in 1 paired melanoma biopsy (1 of 1; on treatment at cycle 2); CD8+ T cell density and PD-L1 tumor proportion score increased 5.2- and 11-fold, respectively, supporting evidence that ALKS 4230 has immunostimulatory impact on the TME and providing rationale for combining ALKS 4230 with pembrolizumab (figure 1).

Abstract 347 Table 1. Summary of response observations among patients with ovarian cancer

Abstract 347 Figure 1. Increased markers of lymphocyte tumor infiltration. An increase in CD3+CD8+ T cells (A, red = CD3; blue = CD8; purple = CD3+CD8+; teal = tumor marker), GranzymeB (B, red = CD8; green = granzymeB; yellow = granzymeB+CD8+; teal = tumor marker), and PD-L1 (C, red = PD-L1; blue = tumor marker) in the tumor microenvironment of a single patient was observed after the patient received monotherapy ALKS 4230

Conclusions
The combination of ALKS 4230, an investigational agent, and pembrolizumab demonstrates an acceptable safety profile and provides some evidence of tumor shrinkage and disease stabilization in some patients with heavily pretreated OC. This regimen could represent a new therapeutic option for these patients.

Acknowledgements
The authors would like to thank all of the patients who are participating in this trial and their families. The trial is sponsored by Alkermes, Inc. Medical writing and editorial support was provided by Parexel and funded by Alkermes, Inc.

Trial Registration
ClinicalTrials.gov NCT02799095

Ethics Approval
This trial was approved by Ethics and Institutional Review Boards (IRBs) at all trial sites; IRB reference numbers 16–229 (Dana-Farber Cancer Institute), MOD00003422/PH285316 (Roswell Park Comprehensive Cancer Center), 20160175 (Western IRB), i15-01394_MOD23 (New York University School of Medicine), TRIAL20190090 (Cleveland Clinic), and 0000097 (ADVARRA).

References
1. Lopes JE, Fisher JL, Flick HL, Wang C, Sun L, Ernstoff MS, et al. ALKS 4230: a novel engineered IL-2 fusion protein with an improved cellular selectivity profile for cancer immunotherapy. J Immunother Cancer 2020;8:e000673. doi: 10.1136/jitc-2020-000673.
2. Vaishampayan UN, Muzaffar J, Velcheti V, Winer I, Hoimes CJ, Rosen SD, et al. ALKS 4230 monotherapy and in combination with pembrolizumab (pembro) in patients (pts) with refractory solid tumors (ARTISTRY-1). Oral presentation at: European Society for Medical Oncology Annual Meeting; September 2020; virtual.


