bioinformatic tools Latest Research Papers

Basics of high throughput sequencing Summary

Medycyna Weterynaryjna ◽

10.21521/mw.6594 ◽

2025 ◽

Vol 77 (11) ◽

pp. 6589-2025

Author(s):

ALEKSANDRA GIZA ◽

EWELINA IWAN ◽

ARKADIUSZ BOMBA ◽

DARIUSZ WASYL

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Clinical Sample ◽

Genetic Material ◽

Read Length ◽

Library Preparation ◽

Sequencing Platform ◽

Bioinformatic Tools ◽

Whole Process ◽

The Right

Sequencing can provide genomic characterisation of a specific organism, as well as of a whole environmental or clinical sample. High Throughput Sequencing (HTS) makes it possible to generate an enormous amount of genomic data at gradually decreasing costs and almost in real time. HTS is used in medicine, veterinary medicine, microbiology, virology and epidemiology, among other fields. The paper presents practical aspects of HTS technology. It describes generations of sequencing, which vary in throughput, read length, accuracy and cost, and are therefore used for different applications. The stages of HTS, as well as their purposes and pitfalls, are presented: extraction of the genetic material, library preparation, sequencing and data processing. For the whole process to succeed, all stages need to follow strict quality control measures. Choosing the right sequencing platform, proper sample and library preparation procedures, and adequate bioinformatic tools is crucial for high-quality results.

Benchmarking software to predict antibiotic resistance phenotypes in shotgun metagenomes using simulated data

10.1101/2022.01.13.476279 ◽

2022 ◽

Author(s):

Emily F Wissel ◽

Brooke M Talbot ◽

Bjorn A Johnson ◽

Robert A Petit ◽

Vicki Hertzberg ◽

...

Keyword(s):

Antibiotic Resistance ◽

Open Source ◽

Simulated Data ◽

Metagenomic Data ◽

Clinical Samples ◽

Bacterial Strains ◽

Minimal Processing ◽

Genotypic Resistance ◽

Shotgun Metagenomics ◽

Bioinformatic Tools

The use of shotgun metagenomics for AMR detection is appealing because data can be generated from clinical samples with minimal processing. Detecting antimicrobial resistance (AMR) in clinical genomic data is an important epidemiological task, yet a complex bioinformatic process. Many software tools exist to detect AMR genes, but they have mostly been tested in their detection of genotypic resistance in individual bacterial strains. It is important to understand how well these bioinformatic tools detect AMR genes in shotgun metagenomic data. We developed a software pipeline, hAMRoaster (https://github.com/ewissel/hAMRoaster), for assessing accuracy of prediction of antibiotic resistance phenotypes. For evaluation purposes, we simulated a short read (Illumina) shotgun metagenomics community of eight bacterial pathogens with extensive antibiotic susceptibility testing profiles. We benchmarked nine open source bioinformatics tools for detecting AMR genes that 1) were conda or Docker installable, 2) had been actively maintained, 3) had an open source license, and 4) took FASTA or FASTQ files as input. Several metrics were calculated for each tool including sensitivity, specificity, and F1 at three coverage levels. This study revealed that tools were highly variable in sensitivity (0.25 - 0.99) and specificity (0.2 - 1) in detection of resistance in our synthetic FASTQ files despite similar databases and methods implemented. Tools performed similarly at all coverage levels (5x, 50x, 100x). Cohen’s kappa revealed low agreement across tools.
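The metrics named in this abstract (sensitivity, specificity, F1 and Cohen's kappa) can be illustrated with a minimal Python sketch. It is not taken from hAMRoaster; the function names and the per-drug boolean encoding (True = resistance, False = susceptibility) are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for paired resistance calls (True = resistant)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    return tp, tn, fp, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly resistant cases that the tool detected."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly susceptible cases that the tool left uncalled."""
    return tn / (tn + fp) if (tn + fp) else float("nan")


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else float("nan")


def cohens_kappa(calls_a: Sequence[bool], calls_b: Sequence[bool]) -> float:
    """Chance-corrected agreement between two tools' binary calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    n = len(calls_a)
    observed = sum(1 for a, b in zip(calls_a, calls_b) if a == b) / n
    p_a = sum(calls_a) / n  # share of positive calls from tool A
    p_b = sum(calls_b) / n  # share of positive calls from tool B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both tools make the same constant call
    return (observed - expected) / (1 - expected)
```

Computing kappa pairwise across tools, alongside per-tool sensitivity and specificity against the known susceptibility profile, is one way to arrive at the kind of agreement summary the abstract reports.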

Computational Prediction of Bacteriophage Host Ranges

Microorganisms ◽

10.3390/microorganisms10010149 ◽

2022 ◽

Vol 10 (1) ◽

pp. 149

Author(s):

Cyril J. Versoza ◽

Susanne P. Pfeifer

Keyword(s):

Antibiotic Resistance ◽

Host Range ◽

High Throughput ◽

Gold Standard ◽

Computational Prediction ◽

Bioinformatic Tools ◽

Key Factor

Increased antibiotic resistance has prompted the development of bacteriophage agents for a multitude of applications in agriculture, biotechnology, and medicine. A key factor in the choice of agents for these applications is the host range of a bacteriophage, i.e., the bacterial genera, species, and strains a bacteriophage is able to infect. Although experimental explorations of host ranges remain the gold standard, such investigations are inherently limited to a small number of viruses and bacteria amenable to cultivation. Here, we review recently developed bioinformatic tools that offer a promising and high-throughput alternative by computationally predicting the putative host ranges of bacteriophages, including those challenging to grow in laboratory environments.

Using synthetic chromosome controls to evaluate the sequencing of difficult regions within the human genome

Genome Biology ◽

10.1186/s13059-021-02579-6 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Andre L. M. Reis ◽

Ira W. Deveson ◽

Bindu Swapna Madala ◽

Ted Wong ◽

Chris Barker ◽

...

Keyword(s):

Human Genome ◽

Comprehensive Evaluation ◽

Diagnostic Yield ◽

Ground Truth ◽

Low Complexity ◽

Systematic Bias ◽

Preparation Methods ◽

Synthetic Dna ◽

Bioinformatic Tools ◽

Sequencing Technologies

Background: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low-complexity, and repetitive regions that are difficult to sequence and analyze. Despite their difficulty, these regions include many clinically important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. Results: To evaluate the accuracy by which these difficult regions are analyzed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically important human genome regions, including repeats, microsatellites, HLA genes, and immune receptors. These controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. Conclusions: This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.

Profiling transcription factor activity dynamics using intronic reads in time-series transcriptome data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009762 ◽

2022 ◽

Vol 18 (1) ◽

pp. e1009762

Author(s):

Yan Wu ◽

Lingfeng Xue ◽

Wen Huang ◽

Minghua Deng ◽

Yihan Lin

Keyword(s):

Time Series ◽

T Cell Activation ◽

Cell Activation ◽

Dynamic Processes ◽

Transcriptome Data ◽

Bioinformatic Tools ◽

Cellular Processes ◽

Recent Developments ◽

Level Information

Activities of transcription factors (TFs) are temporally modulated to regulate dynamic cellular processes, including development, homeostasis, and disease. Recent developments of bioinformatic tools have enabled the analysis of TF activities using transcriptome data. However, because these methods typically use exon-based target expression levels, the estimated TF activities have limited temporal accuracy. To address this, we proposed a TF activity measure based on intron-level information in time-series RNA-seq data, and implemented it to decode the temporal control of TF activities during dynamic processes. We showed that TF activities inferred from intronic reads can better recapitulate instantaneous TF activities compared to the exon-based measure. By analyzing public and our own time-series transcriptome data, we found that intron-based TF activities improve the characterization of temporal phasing of cycling TFs during circadian rhythm, and facilitate the discovery of two temporally opposing TF modules during T cell activation. Collectively, we anticipate that the proposed approach would be broadly applicable for decoding global transcriptional architecture during dynamic processes.

Host microbiomes in tumor precision medicine: how far are we?

Current Medicinal Chemistry ◽

10.2174/0929867329666220105121754 ◽

2022 ◽

Vol 29 ◽

Author(s):

Federica D'Amico ◽

Monica Barone ◽

Teresa Tavella ◽

Simone Rampelli ◽

Patrizia Brigidi ◽

...

Keyword(s):

Precision Medicine ◽

Gut Microbiome ◽

Fecal Microbiota Transplantation ◽

Current Knowledge ◽

Safety Information ◽

Fecal Microbiota ◽

Response To Therapy ◽

Bioinformatic Tools ◽

Therapeutic Outcomes

Abstract: The human gut microbiome has received a crescendo of attention in recent years, due to the countless influences on human pathophysiology, including cancer. Research on cancer and anticancer therapy is constantly looking for new hints to improve the response to therapy while reducing the risk of relapse. In this scenario, the gut microbiome and the plethora of microbial-derived metabolites are considered a new opening in the development of innovative anticancer treatments for a better prognosis. This narrative review summarizes the current knowledge on the role of the gut microbiome in the onset and progression of cancer, as well as in response to chemo-immunotherapy. Recent findings regarding the tumor microbiome and its implications for clinical practice are also commented on. Current microbiome-based intervention strategies (i.e., prebiotics, probiotics, live biotherapeutics and fecal microbiota transplantation) are then discussed, along with key shortcomings, including a lack of long-term safety information in patients who are already severely compromised by standard treatments. The implementation of bioinformatic tools applied to microbiomics and other omics data, such as machine learning, has an enormous potential to push research in the field, enabling the prediction of health risk and therapeutic outcomes, for a truly personalized precision medicine.

Influence of sequencing depth on bacterial classification and abundance in bacterial communities

10.1101/2022.01.04.474922 ◽

2022 ◽

Author(s):

Fernando Mejia ◽

Francisco Avilés Jiménez ◽

Alfonso Méndez Tenorio

Keyword(s):

Relative Abundance ◽

Bacterial Communities ◽

Beta Diversity ◽

Sequencing Depth ◽

Diversity Analysis ◽

Rarefaction Analysis ◽

Form Of Life ◽

Bioinformatic Tools ◽

Number Of Species ◽

Bacterial Classification

Microbial diversity is the most abundant form of life. Next Generation Sequencing technologies provide the capacity to study complex bacterial communities, in which sequencing depth and the choice of bioinformatic tools can influence the results. In this work we explored two different protocols for bacterial classification and abundance evaluation, using 10 bacterial genomes in a simulated sample at different sequencing depths. Protocol A consisted of metagenome assembly with Megahit and Ray Meta followed by taxonomic classification with Kraken2 and Centrifuge. Protocol B consisted of taxonomic classification only. In both protocols, rarefaction, relative abundance and beta diversity were analyzed. In protocol A, Megahit produced a mean contig length of 1,128 nucleotides and Ray Meta of 8,893 nucleotides. The number of species correctly classified in all depth assays was 6 out of 10 for protocol A and 9 out of 10 for protocol B. The rarefaction analysis showed an overestimation of the number of species in almost all assays regardless of the protocol, and the beta diversity analysis indicated significant differences in all comparisons. Protocol A was more efficient for diversity analysis, while protocol B estimated relative abundance more precisely. Our results do not allow us to suggest an optimal sequencing depth at the species level.
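Two of the quantities analyzed here, relative abundance and beta diversity, can be illustrated with a short Python sketch. The abstract does not say which beta diversity measure was used; Bray-Curtis dissimilarity is assumed here purely as a common example, and the function names and input format (a mapping from taxon name to read count) are hypothetical.

```python
from typing import Dict, Mapping


def relative_abundance(counts: Mapping[str, float]) -> Dict[str, float]:
    """Convert per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no classified reads")
    return {taxon: count / total for taxon, count in counts.items()}


def bray_curtis(sample_a: Mapping[str, float], sample_b: Mapping[str, float]) -> float:
    """Bray-Curtis dissimilarity: 0 for identical profiles, 1 for no shared taxa.

    Taxa missing from one sample are treated as having abundance 0.
    """
    taxa = set(sample_a) | set(sample_b)
    numerator = sum(abs(sample_a.get(t, 0.0) - sample_b.get(t, 0.0)) for t in taxa)
    denominator = sum(sample_a.get(t, 0.0) + sample_b.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Illustrative counts only; these are not data from the study.
shallow = {"Escherichia coli": 60, "Staphylococcus aureus": 40}
deep = {"Escherichia coli": 30, "Staphylococcus aureus": 50, "Klebsiella pneumoniae": 20}

dissimilarity = bray_curtis(relative_abundance(shallow), relative_abundance(deep))
```

Normalizing to relative abundance before comparing samples keeps differences in total read count, which vary with sequencing depth, from dominating the dissimilarity value.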

Identification of Somatic Mutations From Bulk and Single-Cell Sequencing Data

Frontiers in Aging ◽

10.3389/fragi.2021.800380 ◽

2022 ◽

Vol 2 ◽

Author(s):

August Yue Huang ◽

Eunjung Alice Lee

Keyword(s):

Single Cell ◽

Somatic Mutations ◽

Sequencing Data ◽

Tissue Samples ◽

Genome Amplification ◽

Bioinformatic Tools ◽

Base Calling ◽

Human Lifespan ◽

Dna Variants ◽

Underlying Mechanisms

Somatic mutations are DNA variants that occur after the fertilization of zygotes and accumulate during the developmental and aging processes in the human lifespan. Somatic mutations have long been known to cause cancer, and more recently have been implicated in a variety of non-cancer diseases. The patterns of somatic mutations, or mutational signatures, also shed light on the underlying mechanisms of the mutational process. Advances in next-generation sequencing over the decades have enabled genome-wide profiling of DNA variants in a high-throughput manner; however, unlike germline mutations, somatic mutations are carried only by a subset of the cell population. Thus, sensitive bioinformatic methods are required to distinguish mutant alleles from sequencing and base calling errors in bulk tissue samples. An alternative way to study somatic mutations, especially those present in an extremely small number of cells or even in a single cell, is to sequence single-cell genomes after whole-genome amplification (WGA); however, it is critical and technically challenging to exclude numerous technical artifacts arising during error-prone and uneven genome amplification in current WGA methods. To address these challenges, multiple bioinformatic tools have been developed. In this review, we summarize the latest progress in methods for identification of somatic mutations and the challenges that remain to be addressed in the future.
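The idea of distinguishing a low-frequency mutant allele from sequencing error in bulk data can be sketched with a simple binomial tail probability. This is a textbook illustration, not the method of any tool covered in the review; the function name, the fixed per-base error rate and the assumption of independent errors are simplifications introduced here.

```python
import math


def error_tail_probability(alt_reads: int, depth: int, error_rate: float) -> float:
    """P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    A small value means the observed alternate-allele reads are unlikely to be
    explained by sequencing error alone. Assumes independent errors at a single
    fixed rate, and is intended for modest depths (the exact binomial
    coefficients can overflow float conversion at very high depth).
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a probability")
    return sum(
        math.comb(depth, i) * error_rate**i * (1.0 - error_rate) ** (depth - i)
        for i in range(alt_reads, depth + 1)
    )


def variant_allele_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a site that support the alternate allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth
```

With 5 alternate reads at a depth of 100 and an assumed error rate of 0.001, the tail probability is very small, so error alone is an unlikely explanation; a single alternate read at the same depth is not distinguishable from error under this model. Real callers additionally account for base quality, strand bias, mapping artifacts and, for single-cell data, amplification errors.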

Homozygous mutation in SLO3 leads to severe asthenoteratozoospermia due to acrosome hypoplasia and mitochondrial sheath malformations

Reproductive Biology and Endocrinology ◽

10.1186/s12958-021-00880-4 ◽

2022 ◽

Vol 20 (1) ◽

Author(s):

Mingrong Lv ◽

Chunyu Liu ◽

Chunjie Ma ◽

Hui Yu ◽

Zhongmei Shao ◽

...

Keyword(s):

Membrane Potential ◽

Missense Variant ◽

Human Sperm ◽

Human Case ◽

Chinese Patient ◽

Affected Individual ◽

Homozygous Mutation ◽

Sequencing Analysis ◽

Bioinformatic Tools ◽

Ultrastructural Studies

Background: Potassium channels are important for the structure and function of spermatozoa. As a potassium transporter, mSlo3 is essential for male fertility: Slo3 knockout male mice were infertile, with a series of functional defects in sperm cells. However, no pathogenic variant has been detected in human SLO3 to date. Here we report a human case with a homozygous SLO3 mutation. The function of SLO3 in human sperm and the corresponding assisted reproductive strategy are also investigated. Methods: We performed whole-exome sequencing analysis of a large cohort of 105 patients with asthenoteratozoospermia. The effects of the variant were investigated by quantitative RT-PCR, western blotting, and immunofluorescence assays using the patient's spermatozoa. Sperm morphological and ultrastructural studies were conducted using haematoxylin and eosin staining and scanning and transmission electron microscopy. Results: We identified a homozygous missense variant (c.1237A > T: p.Ile413Phe) in the sperm-specific SLO3 in one Chinese patient with male infertility. This SLO3 variant was rare in human control populations and predicted to be deleterious by multiple bioinformatic tools. Sperm from the individual harbouring the homozygous SLO3 variant exhibited severe morphological abnormalities, such as acrosome hypoplasia, disruption of the mitochondrial sheath, and coiled tails, as well as motility defects. The levels of SLO3 mRNA and protein in spermatozoa from the affected individual were reduced. Furthermore, the acrosome reaction, mitochondrial membrane potential, and membrane potential during capacitation were also impaired. The levels of acrosome marker glycoproteins and PLCζ1, as well as the mitochondrial sheath protein HSP60 and the SLO3 auxiliary subunit LRRC52, were significantly reduced in the spermatozoa from the affected individual. The affected man was sterile due to acrosome and mitochondrial dysfunction; however, intra-cytoplasmic sperm injection successfully rescued this infertile condition. Conclusions: SLO3 deficiency seriously impacts acrosome formation, mitochondrial sheath assembly, and the function of K+ channels. Our findings provide clinical implications for the genetic and reproductive counselling of affected families.

Differential word expression analyses highlight plague dynamics during the second pandemic

Royal Society Open Science ◽

10.1098/rsos.210039 ◽

2022 ◽

Vol 9 (1) ◽

Author(s):

Rémi Barbieri ◽

Riccardo Nodari ◽

Michel Signoli ◽

Sara Epis ◽

Didier Raoult ◽

...

Keyword(s):

Yersinia Pestis ◽

Historical Records ◽

Interpretive Bias ◽

Bioinformatic Tools ◽

Potential Sources ◽

Negative Controls ◽

Nested Network ◽

Ancient Texts

Research on the second plague pandemic that swept over Europe from the fourteenth to nineteenth centuries mainly relies on the exegesis of contemporary texts and is prone to interpretive bias. By leveraging certain bioinformatic tools routinely used in biology, we developed a quantitative lexicography of 32 texts describing two major plague outbreaks, using contemporary plague-unrelated texts as negative controls. Nested, network and category analyses of a 207-word pan-lexicome, comprising overrepresented terms in plague-related texts, indicated that ‘buboes’ and ‘carbuncles’ are words that were significantly associated with the plague and signalled an ectoparasite-borne plague. Moreover, plague-related words were associated with the terms ‘merchandise’, ‘movable’, ‘tatters’, ‘bed’ and ‘clothes’. Analysing ancient texts using the method reported in this paper can certify plague-related historical records and indicate the particularities of each plague outbreak, which can inform on the potential sources for the causative Yersinia pestis.
