bioinformatics pipeline
Recently Published Documents


TOTAL DOCUMENTS

415
(FIVE YEARS 274)

H-INDEX

19
(FIVE YEARS 9)

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Ludwig Mann ◽  
Kathrin M. Seibt ◽  
Beatrice Weber ◽  
Tony Heitkam

Abstract Background Extrachromosomal circular DNAs (eccDNAs) are ring-like DNA structures physically separated from the chromosomes, ranging in size from 100 bp to several megabase pairs. Apart from carrying tandemly repeated DNA, eccDNAs may also harbor extra copies of genes or recently activated transposable elements. As eccDNAs occur in all eukaryotes investigated so far and likely play roles in stress, cancer, and aging, they have been prime targets in recent research, although their investigation has been limited by the scarcity of computational tools. Results Here, we present the ECCsplorer, a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing techniques. Following Illumina-sequencing of amplified circular DNA (circSeq), the ECCsplorer enables an easy and automated discovery of eccDNA candidates. The data analysis encompasses two major procedures: first, read mapping to the reference genome allows the detection of informative read distributions including high coverage, discordant mapping, and split reads. Second, reference-free comparison of read clusters from amplified eccDNA against control sample data reveals specifically enriched DNA circles. Both software parts can be run separately or jointly, depending on the individual aim or data availability. To illustrate the wide applicability of our approach, we analyzed semi-artificial and published circSeq data from the model organisms Homo sapiens and Arabidopsis thaliana, and generated circSeq reads from the non-model crop plant Beta vulgaris. We clearly identified eccDNA candidates from all datasets, with and without reference genomes. The ECCsplorer pipeline specifically detected mitochondrial mini-circles and retrotransposon activation, showcasing the ECCsplorer’s sensitivity and specificity.
Conclusion The ECCsplorer (available online at https://github.com/crimBubble/ECCsplorer) is a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing data. The derived eccDNA targets are valuable for a wide range of downstream investigations, from the analysis of cancer-related eccDNAs and organelle genomics to the identification of active transposable elements.
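
The mapping-based signals named in the abstract (high coverage and discordant mapping) can be illustrated with a minimal Python sketch. This is not the ECCsplorer's code: the function names, the window-based coverage input, and the fold threshold are all assumptions made for illustration, and read lengths are ignored for simplicity.

```python
from statistics import median


def enriched_windows(window_coverage, fold=5.0):
    """Return indices of windows whose coverage exceeds `fold` times the median.

    `window_coverage` is a list of per-window read depths along the reference.
    The fold threshold is an illustrative assumption, not a published default.
    """
    baseline = max(median(window_coverage), 1.0)  # guard against a zero threshold
    return [i for i, cov in enumerate(window_coverage) if cov > fold * baseline]


def is_outward_facing(forward_mate_start, reverse_mate_start):
    """True if the reverse-strand mate maps upstream of the forward-strand mate.

    On a linear reference, proper pairs face each other (forward mate first).
    A pair that spans the junction of a circle maps "inside out" instead,
    which is one form of discordant mapping.
    """
    return reverse_mate_start < forward_mate_start


if __name__ == "__main__":
    coverage = [10, 12, 11, 90, 95, 9]
    print(enriched_windows(coverage))      # windows with unusually high depth
    print(is_outward_facing(5000, 1000))   # pair consistent with a circle junction
```

In practice these signals would be combined, since high coverage alone can also arise from repeats or amplification bias.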


NAR Cancer ◽  
2022 ◽  
Vol 4 (1) ◽  
Author(s):  
Eirik Høye ◽  
Bastian Fromm ◽  
Paul H M Böttger ◽  
Diana Domanska ◽  
Annette Torgunrud ◽  
...  

ABSTRACT Although microRNAs (miRNAs) contribute to all hallmarks of cancer, miRNA dysregulation in metastasis remains poorly understood. The aim of this work was to reliably identify miRNAs associated with metastatic progression of colorectal cancer (CRC) using novel and previously published next-generation sequencing (NGS) datasets generated from 268 samples of primary (pCRC) and metastatic CRC (mCRC; liver, lung and peritoneal metastases) and tumor adjacent tissues. Differential expression analysis was performed using a meticulous bioinformatics pipeline, including only bona fide miRNAs, and utilizing miRNA-tailored quality control and processing. Five miRNAs were identified as up-regulated at multiple metastatic sites: Mir-210_3p, Mir-191_5p, Mir-8-P1b_3p [mir-141–3p], Mir-1307_5p and Mir-155_5p. Several have previously been implicated in metastasis through involvement in epithelial-to-mesenchymal transition and hypoxia, while other identified miRNAs represent novel findings. The use of a publicly available pipeline facilitates reproducibility and allows new datasets to be added as they become available. The set of miRNAs identified here provides a reliable starting point for further research into the role of miRNAs in metastatic progression.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Julia Vetter ◽  
Susanne Schaller ◽  
Andreas Heinzel ◽  
Constantin Aschauer ◽  
Roman Reindl-Schwaighofer ◽  
...  

Abstract Background Next-generation sequencing (NGS) is now the most widely used high-throughput technology for DNA sequencing. Among other applications, NGS enables the in-depth analysis of immune repertoires. Research in the field of T cell receptor (TCR) and immunoglobulin (IG) repertoires aids in understanding immunological diseases. A main objective is the analysis of the V(D)J recombination defining the structure and specificity of the immune repertoire. Accurate processing, evaluation and visualization of immune repertoire NGS data is important for better understanding immune responses and immunological behavior. Results ImmunoDataAnalyzer (IMDA) is a pipeline we have developed for automating the analysis of immunological NGS data. IMDA unites the functionality from carefully selected immune repertoire analysis software tools and covers the whole spectrum from initial quality control up to the comparison of multiple immune repertoires. It provides methods for automated pre-processing of barcoded and UMI-tagged immune repertoire NGS data, facilitates the assembly of clonotypes and calculates key figures for describing the immune repertoire. These include commonly used clonality and diversity measures, as well as indicators for V(D)J gene segment usage and between-sample similarity. IMDA reports all relevant information in a compact summary containing visualizations, calculations, and sample details, all of which serve for a more detailed overview. IMDA further generates an output file including key figures for all samples, designed to serve as input for machine learning frameworks to find models for differentiating between specific traits of samples. Conclusions IMDA constructs TCR and IG repertoire data from raw NGS reads and facilitates descriptive data analysis and comparison of immune repertoires. The IMDA workflow focuses on quality control and ease of use for non-computer scientists.
The provided output directly facilitates the interpretation of input data and includes information about clonality, diversity, clonotype overlap as well as similarity, and V(D)J gene segment usage. IMDA further supports the detection of sample swaps and cross-sample contamination that potentially occurred during sample preparation. In summary, IMDA reduces the effort usually required for immune repertoire data analysis by providing an automated workflow for processing raw NGS data into immune repertoires and subsequent analysis. The implementation is open-source and available at https://bioinformatics.fh-hagenberg.at/immunoanalyzer/.
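
The abstract does not state which clonality and diversity measures IMDA reports. As a hedged illustration, the sketch below uses one common pair of definitions: Shannon entropy over clonotype frequencies as the diversity measure, and clonality as one minus the normalized entropy. The function names are hypothetical and are not part of IMDA.

```python
import math


def shannon_entropy(clonotype_counts):
    """Shannon entropy (natural log) of a list of clonotype read counts."""
    total = sum(clonotype_counts)
    return -sum(
        (c / total) * math.log(c / total) for c in clonotype_counts if c > 0
    )


def clonality(clonotype_counts):
    """Clonality as 1 - normalized Shannon entropy.

    0 means all clonotypes are equally frequent; 1 means a single clonotype
    dominates completely. This is an assumed definition for illustration.
    """
    observed = sum(1 for c in clonotype_counts if c > 0)
    if observed == 0:
        raise ValueError("no clonotypes with non-zero counts")
    if observed == 1:
        return 1.0  # a single clonotype is monoclonal by definition
    return 1.0 - shannon_entropy(clonotype_counts) / math.log(observed)


if __name__ == "__main__":
    even_repertoire = [25, 25, 25, 25]
    skewed_repertoire = [970, 10, 10, 10]
    print(clonality(even_repertoire))
    print(clonality(skewed_repertoire))
```

A skewed repertoire yields a clonality closer to 1 than an even one, which is the property such key figures are meant to capture when comparing samples.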


Author(s):  
Ranine Ghamrawi ◽  
Igor Velickovic ◽  
Ognjen Milicevic ◽  
Wendy M. White ◽  
Lillian Rosa Thistlethwaite ◽  
...  

Background: We aimed to assess the extent to which the buffy coat DNA methylome is representative of methylation patterns in constitutive white blood cell (WBC) types in normal pregnancy. Methods: A comparison of differential methylation of buffy coat DNA vs DNA isolated from polymorphonuclear (PMN) and lymphocytic fractions was performed for each blood sample obtained within 24 h prior to delivery from 29 normotensive pregnant women. Methylation profiles were obtained using an Illumina Human Methylation 450 BeadChip and the ChAMP bioinformatics pipeline. A subset of differentially methylated probes (DMPs) showing discordant methylation was further investigated using statistical modeling and enrichment analysis. Results: The smallest number of DMPs was found between the buffy coat and the PMN fraction (2.96%). Pathway enrichment analysis of the DMPs identified biological pathways involved in the particular leukocyte lineage, consistent with perturbations during isolation. The comparisons between the buffy coat and the isolated fractions as a group using linear modeling yielded a small number of probes (∼29,000) with discordant methylation. Demethylation of probes in the buffy coat compared to derived cell lines was more common and was prevalent in shelf and open sea regions. Conclusion: Buffy coat is representative of methylation patterns in WBC types in normal pregnancy. The differential methylations are consistent with perturbations during isolation of constituent cells and likely originate in vitro due to the physical stress during cell separation and are of no physiological relevance. These findings help the interpretation of DNA methylation profiling in pregnancy and numerous other conditions.


Life ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 69
Author(s):  
Davide Vacca ◽  
Antonino Fiannaca ◽  
Fabio Tramuto ◽  
Valeria Cancila ◽  
Laura La Paglia ◽  
...  

In consideration of the increasing prevalence of COVID-19 cases in several countries and the resulting demand for unbiased sequencing approaches, we performed a direct RNA sequencing (direct RNA seq.) experiment using critical oropharyngeal swab samples collected from Italian patients infected with SARS-CoV-2 from the Palermo region in Sicily. Here, we identified SARS-CoV-2 sequences directly in RNA extracted from critical samples using the Oxford Nanopore MinION technology without prior cDNA retrotranscription. Using an appropriate bioinformatics pipeline, we could identify mutations in the nucleocapsid (N) gene, which have been reported previously in studies conducted in other countries. In conclusion, to the best of our knowledge, the technique used in this study has not been used for SARS-CoV-2 detection previously owing to the difficulties in the extraction of RNA of sufficient quantity and quality from routine oropharyngeal swabs. Despite these limitations, this approach provides the advantages of true native RNA sequencing and does not include amplification steps that could introduce systematic errors. This study can provide novel information relevant to the current strategies adopted in SARS-CoV-2 next-generation sequencing.


2022 ◽  
Vol 12 ◽  
Author(s):  
Zizhang Sheng ◽  
Jude S. Bimela ◽  
Phinikoula S. Katsamba ◽  
Saurabh D. Patel ◽  
Yicheng Guo ◽  
...  

Accumulation of somatic hypermutation (SHM) is the primary mechanism to enhance the binding affinity of antibodies to antigens in vivo. However, the structural basis of the effects of many SHMs remains elusive. Here, we integrated atomistic molecular dynamics (MD) simulation and data mining to build a high-throughput structural bioinformatics pipeline to study the effects of individual and combination SHMs on antibody conformation, flexibility, stability, and affinity. By applying this pipeline, we characterized a common mechanism of modulation of heavy-light pairing orientation by frequent SHMs at framework positions 39H, 91H, 38L, and 87L through disruption of a conserved hydrogen-bond network. Q39LH alone and in combination with light chain framework 4 (FWR4L) insertions further modulated the elbow angle between variable and constant domains of many antibodies, resulting in improved binding affinity for a subset of anti-HIV-1 antibodies. Q39LH also alleviated aggregation induced by FWR4L insertion, suggesting remote epistasis between these SHMs. Altogether, this study provides tools and insights for understanding antibody affinity maturation and for engineering functionally improved antibodies.


2021 ◽  
Author(s):  
Stephanie L Battle ◽  
Daniela Puiu ◽  
Eric Boerwinkle ◽  
Kent Taylor ◽  
Jerome Rotter ◽  
...  

Mitochondrial diseases are a heterogeneous group of disorders that can be caused by mutations in the nuclear or mitochondrial genome. Mitochondrial DNA variants may exist in a state of heteroplasmy, where a percentage of DNA molecules harbor a variant, or homoplasmy, where all DNA molecules have a variant. The relative quantity of mtDNA in a cell, or copy number (mtDNA-CN), is associated with mitochondrial function, human disease, and mortality. To facilitate accurate identification of heteroplasmy and quantify mtDNA-CN, we built a bioinformatics pipeline that takes whole genome sequencing data and outputs mitochondrial variants and mtDNA-CN. We incorporate variant annotations to facilitate determination of variant significance. Our pipeline yields uniform coverage by remapping to a circularized chrM and recovering reads falsely mapped to nuclear-encoded mitochondrial sequences. Notably, we construct a consensus chrM sequence for each sample and recall heteroplasmy against the sample's unique mitochondrial genome. We observe an approximately 3-fold increased association with age for heteroplasmic variants in non-homopolymer regions and are better able to capture genetic variation in the D-loop of chrM compared to existing software. Our bioinformatics pipeline captures features of mitochondrial genetics that are important in understanding how mitochondrial dysfunction contributes to disease more accurately than existing pipelines.
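
The quantities described above can be sketched in a few lines of Python. The abstract does not give the pipeline's estimators, so the formulas below are common conventions used here as assumptions: heteroplasmy as the fraction of reads supporting the variant, and mtDNA-CN as twice the ratio of mitochondrial to autosomal coverage (two autosomal copies per diploid cell). The rotation helpers illustrate the general idea of remapping to a shifted circular reference and are hypothetical names, not the authors' code.

```python
def heteroplasmy_fraction(variant_reads, total_reads):
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def mtdna_copy_number(mean_mt_coverage, mean_autosomal_coverage):
    """Estimate mtDNA copies per cell, assuming a diploid nuclear genome."""
    if mean_autosomal_coverage <= 0:
        raise ValueError("autosomal coverage must be positive")
    return 2.0 * mean_mt_coverage / mean_autosomal_coverage


def rotate_reference(sequence, shift):
    """Rotate a circular sequence so the original ends become contiguous."""
    return sequence[shift:] + sequence[:shift]


def original_position(rotated_position, shift, length):
    """Map a 0-based position on the rotated reference back to the original."""
    return (rotated_position + shift) % length


if __name__ == "__main__":
    print(heteroplasmy_fraction(15, 100))
    print(mtdna_copy_number(3000.0, 30.0))
    print(rotate_reference("ABCDEFGH", 3), original_position(6, 3, 8))
```

Mapping reads to both the original and a rotated reference lets reads that span the artificial break point align in one piece, which is one way uniform coverage across a circular genome can be obtained.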


2021 ◽  
Vol 23 (1) ◽  
pp. 381
Author(s):  
Andrea C. Büchler ◽  
Vladimir Lazarevic ◽  
Nadia Gaïa ◽  
Myriam Girard ◽  
Friedrich Eckstein ◽  
...  

We present the case of a 72-year-old female patient with acute contained rupture of a biological composite graft, 21 months after replacement of the aortic valve and the ascending aorta due to an aortic dissection. Auramine-rhodamine staining of intraoperative biopsies showed acid-fast bacilli, but classical culture and molecular methods failed to identify any organism. Metagenomic analysis indicated infection with Mycobacterium chelonae, which was confirmed by target-specific qPCR. The complexity of the sample required a customized bioinformatics pipeline, including cleaning steps to remove sequences of human, bovine and pig origin. Our study underlines the importance of multiple testing to increase the likelihood of pathogen identification in highly complex samples.


2021 ◽  
Author(s):  
Anton Pembaur ◽  
Erwan Sallard ◽  
Patrick Weil ◽  
Jennifer Ortelt ◽  
Parviz Ahmad-Nejad ◽  
...  

We established a protocol for fast, cost-efficient SARS-CoV-2 sequencing with as little hands-on time as possible (around 3 h in total, excluding RNA extraction). The whole sequencing can be done in one working day, including the bioinformatic pipeline. The cost per sample amounts to around $40, starting from already isolated RNA. We adapted and simplified existing workflows using the ‘midnight’ 1,200 bp amplicon split primer sets for PCR, which produce tiled overlapping amplicons covering almost all of the SARS-CoV-2 genome. Subsequently, we applied the Oxford Nanopore Rapid barcoding protocol and the portable MinION Mk1C sequencer in combination with the ARTIC bioinformatics pipeline. We tested the simplified and less time-consuming workflow on confirmed SARS-CoV-2-positive specimens from clinical routine and identified pre-analytical parameters that may help to decrease the rate of sequencing failures. Duration of the complete pipeline was approx. 7 hrs for one specimen and approx. 11 hrs for 12 multiplexed barcoded specimens. This protocol is a modified version of Nikki Freed and Olin Silander's protocol. For information such as primers, see their protocol: Nikki Freed, Olin Silander 2020. nCoV-2019 sequencing protocol (RAPID barcoding, 1200bp amplicon). doi: 10.1093/biomethods/bpaa014 Our peer-reviewed paper is available here: https://www.mdpi.com/2076-2607/9/12/2598
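
The idea of tiled overlapping amplicons split across two primer sets can be sketched as follows. This is a toy layout only: the real 'midnight' primer coordinates are not reproduced here, and the overlap value, the function name, and the strict alternation between two pools are assumptions made for illustration. The genome length in the usage example is that of the Wuhan-Hu-1 reference (29,903 nt).

```python
def tile_amplicons(genome_length, amplicon_length=1200, overlap=100):
    """Return (start, end, pool) tuples for evenly tiled amplicons.

    Coordinates are 0-based, end-exclusive. Neighbouring amplicons overlap by
    `overlap` bases and alternate between pool 1 and pool 2, so amplicons
    amplified in the same reaction never overlap each other.
    """
    if not 0 <= overlap < amplicon_length // 2:
        raise ValueError("overlap must be smaller than half the amplicon length")
    step = amplicon_length - overlap
    tiles = []
    start = 0
    index = 0
    while True:
        end = min(start + amplicon_length, genome_length)
        tiles.append((start, end, 1 + index % 2))
        if end == genome_length:
            break
        start += step
        index += 1
    return tiles


if __name__ == "__main__":
    for start, end, pool in tile_amplicons(29903):
        print(f"pool {pool}: {start}-{end}")
```

Keeping neighbouring amplicons in separate pools avoids short products forming between the overlapping primers of adjacent tiles, which is the usual motivation for split primer sets.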


2021 ◽  
Vol 12 ◽  
Author(s):  
Javier Fernández ◽  
Manuel Fernández-Sanjurjo ◽  
Eduardo Iglesias-Gutiérrez ◽  
Pablo Martínez-Camblor ◽  
Claudio J. Villar ◽  
...  

Background: The effect of resistance training on gut microbiota composition has not been explored, despite the evidence about endurance exercise. The aim of this study was to compare the effect of resistance and endurance training on gut microbiota composition in mice. Methods: Cecal samples were collected from 26 C57BL/6N mice, divided into three groups: sedentary (CTL), endurance training on a treadmill (END), and resistance training on a vertical ladder (RES). After 2 weeks of adaptation, mice were trained for 4 weeks, 5 days/week. Maximal endurance and resistance capacity tests were performed before and after training. Genomic DNA was extracted and 16S ribosomal RNA sequenced for metagenomics analysis. The percentages for each phylum, class, order, family, or genus/species were obtained using an open-source bioinformatics pipeline. Results: END showed higher diversity and evenness. Significant differences among groups in microbiota composition were only observed at genera and species level. END showed a significantly higher relative abundance of Desulfovibrio and Desulfovibrio sp., while Clostridium and C. cocleatum were higher for RES. Trained mice showed significantly lower relative abundance of Ruminococcus gnavus and higher of the genus Parabacteroides compared to CTL. We explored the relationship between relative taxa abundance and maximal endurance and resistance capacities after the training period. Lachnospiraceae and Lactobacillaceae families were negatively associated with endurance performance, while several taxa, including Prevotellaceae family, Prevotella genus, and Akkermansia muciniphila, were positively correlated. Regarding resistance performance, Desulfovibrio sp. was negatively correlated, while Alistipes showed a positive correlation. Conclusion: Resistance and endurance training differentially modify gut microbiota composition in mice, under a highly controlled environment.
Interestingly, taxa associated with anti- and proinflammatory responses presented the same pattern after both models of exercise. Furthermore, the abundance of several taxa was differently related to maximal endurance or resistance performance, although most of them did not respond to training.
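
The per-rank percentages mentioned in the Methods can be illustrated with a short Python sketch. The open-source pipeline used in the study is not named in the abstract, so this is only an assumed, minimal version of the calculation: each classified read carries a lineage, and the percentage of reads per taxon is computed at the requested rank. The function name and the toy lineages are hypothetical, and lineage strings depend on the reference taxonomy used.

```python
from collections import Counter

RANKS = ("phylum", "class", "order", "family", "genus")


def relative_abundance(lineages, rank):
    """Percentage of reads per taxon at the given rank.

    `lineages` is an iterable of tuples ordered as in RANKS, one per read.
    """
    level = RANKS.index(rank)
    counts = Counter(lineage[level] for lineage in lineages)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no classified reads")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    # Toy data: lineage labels are illustrative, not results from the study.
    clostridium = ("Firmicutes", "Clostridia", "Clostridiales",
                   "Clostridiaceae", "Clostridium")
    parabacteroides = ("Bacteroidetes", "Bacteroidia", "Bacteroidales",
                       "Tannerellaceae", "Parabacteroides")
    reads = [clostridium] * 3 + [parabacteroides]
    print(relative_abundance(reads, "genus"))
    print(relative_abundance(reads, "phylum"))
```

Because relative abundances sum to 100% within a sample, an increase in one taxon necessarily lowers the percentages of others, which is worth keeping in mind when comparing groups.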

