Chanjo: Clinical grade sequence coverage analysis

F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 615
Author(s):  
Robin Andeer ◽  
Måns Magnusson ◽  
Anna Wedell ◽  
Henrik Stranneheim

Coverage analysis is essential when analysing massive parallel sequencing (MPS) data. The analysis indicates existence of false negatives or positives in a region of interest or poorly covered genomic regions. There are several tools that have excellent performance when doing coverage analysis on a few samples with predefined regions. However, there is no current tool for collecting samples over a longer period of time for aggregated coverage analysis of multiple samples or sequencing methods. Furthermore, current coverage analysis tools do not generate customized coverage reports or enable exploratory coverage analysis without extensive bioinformatic skill and access to the original alignment files. We present Chanjo, a user friendly coverage analysis tool for persistent storage of coverage data, that, accompanied with Chanjo Report, produces coverage reports that summarize coverage data for predefined regions in an elegant manner. Chanjo Report can produce both structured coverage reports and dynamic reports tailored to a subset of genomic regions, coverage cut-offs or samples. Chanjo stores data in an SQL database where thousands of samples can be added over time, which allows for aggregate queries to discover problematic regions. Chanjo is well tested, supports whole exome and genome sequencing, and follows common UNIX standards, allowing for easy integration into existing pipelines. Chanjo is easy to install and operate, and provides a solution for persistent coverage analysis and clinical-grade reporting. It makes it easy to set up a local database and automate the addition of multiple samples and report generation. To our knowledge there is no other tool with matching capabilities. Chanjo handles the common file formats in genetics, such as BED and BAM, and makes it easy to produce PDF coverage reports that are highly valuable for individuals with limited bioinformatic expertise. 
We believe Chanjo to be a vital tool for clinicians and researchers performing MPS analysis.
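
The idea of persistent storage with aggregate queries can be illustrated with a minimal sketch. It assumes a hypothetical single-table layout (`region_coverage`) and hypothetical helper names; it is not Chanjo's actual schema or API, only an illustration of how per-region coverage accumulated over many samples could be queried to flag regions that are poorly covered across the cohort.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one hypothetical coverage table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE region_coverage ("
        " sample_id TEXT, region_id TEXT, mean_coverage REAL,"
        " PRIMARY KEY (sample_id, region_id))"
    )
    return conn


def add_sample(conn, sample_id, coverage_by_region):
    """Store one sample's mean coverage per region; samples can be added over time."""
    conn.executemany(
        "INSERT INTO region_coverage VALUES (?, ?, ?)",
        [(sample_id, region, cov) for region, cov in coverage_by_region.items()],
    )
    conn.commit()


def problematic_regions(conn, cutoff, min_fraction):
    """Return (region_id, fraction_low) for regions whose mean coverage is
    below `cutoff` in at least `min_fraction` of the stored samples."""
    low = "AVG(CASE WHEN mean_coverage < ? THEN 1.0 ELSE 0.0 END)"
    query = (
        f"SELECT region_id, {low} FROM region_coverage"
        f" GROUP BY region_id HAVING {low} >= ? ORDER BY region_id"
    )
    return conn.execute(query, (cutoff, cutoff, min_fraction)).fetchall()


conn = build_demo_db()
add_sample(conn, "s1", {"GENE1_ex1": 35.0, "GENE1_ex2": 4.0})
add_sample(conn, "s2", {"GENE1_ex1": 40.0, "GENE1_ex2": 8.0})
add_sample(conn, "s3", {"GENE1_ex1": 5.0, "GENE1_ex2": 30.0})
flagged = problematic_regions(conn, cutoff=10.0, min_fraction=0.5)
```

Here `flagged` lists only the region that falls below the 10x cut-off in at least half of the samples. The region names, cut-off and fraction are made-up values for the example.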

Plant Methods ◽  
2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Peio Ziarsolo ◽  
Tomas Hasing ◽  
Rebeca Hilario ◽  
Victor Garcia-Carpintero ◽  
Jose Blanca ◽  
...  

Abstract Background: K-seq, a new genotyping methodology based on the amplification of genomic regions using two steps of Klenow amplification with short oligonucleotides, followed by standard PCR and Illumina sequencing, is presented. The protocol was accompanied by software developed to aid with primer set design. Results: As the first examples, K-seq in species as diverse as tomato, dog and wheat was developed. K-seq provided genetic distances similar to those based on WGS in dogs. Experiments comparing K-seq and GBS in tomato showed similar genetic results, although K-seq had the advantage of finding more SNPs for the same number of Illumina reads. The technology reproducibility was tested with two independent runs of the tomato samples, and the correlation coefficient of the SNP coverages between samples was 0.8 and the genotype match was above 94%. K-seq also proved to be useful in polyploid species. The wheat samples generated specific markers for all subgenomes, and the SNPs generated from the diploid ancestors were located in the expected subgenome with accuracies greater than 80%. Conclusion: K-seq is an open, patent-unencumbered, easy-to-set-up, cost-effective and reliable technology ready to be used by any molecular biology laboratory without special equipment in many genetic studies.


Genetics ◽  
2003 ◽  
Vol 165 (4) ◽  
pp. 2213-2233 ◽  
Author(s):  
Na Li ◽  
Matthew Stephens

Abstract We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of current available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.


2010 ◽  
Vol 25 (3) ◽  
pp. 205-211 ◽  
Author(s):  
Hamed Panjeh ◽  
Hashem Hakimabad ◽  
Lalle Motavalli

The gamma ray spectrum resolution from a 241Am-Be source-based prompt gamma ray activation analysis set-up has been observed to increase in the energy region of interest when the NaI detector is enclosed in a proper neutron and gamma ray shield. We have investigated the fact that the peak resolution of prompt gamma rays in the region of interest from the set-up depends to a great extent on the source activity, the size and kind of the detector, and the geometry of the detector shield. To examine the role of the detector shield, five kinds of detector shield were used and, finally, the proper kind was introduced. Since the detector shield contributes substantially to reducing the undesirable, high-rate gamma rays reaching the gamma ray detector, a well-designed shield enables the elimination of unwanted events such as pulse pile-up. By improving the shielding design, discrete and distinguishable photoelectric peaks in the energy region of interest have been observed in the spectrum of prompt gamma rays.


2014 ◽  
Vol 11 ◽  
pp. 625-633 ◽  
Author(s):  
Domenico Enrico Massimo ◽  
Mariangela Musolino ◽  
Antonino Barbalace ◽  
Cinzia Fragomeni

Pollution, environmental disruption, oversized urban development and new infrastructure construction jeopardize landscape integrity and people's quality of life. This research deals with landscape protection and enhancement, providing governments and decision makers with a comprehensive Decision Support System to assess the quality of natural and cultural heritage and to address planning measures and policy actions for landscape treasuring. The research set up a sound methodology relying upon GIS tools to spatially detect and define landscape units along with their endowments, such as natural, ecological, historic, cultural, and urban resources, which are then valuated with a GIS-integrated multi-criteria analysis tool set up by the research team. The research developed a Case Study in the European Mediterranean Basin, validating the whole system and the performance and support of the GIS tools. The results achieved open the possibility of generalizing the prototype application at regional, country and federation levels and therefore of supporting the planning implementation for landscape enhancement.


Blood ◽  
2004 ◽  
Vol 104 (11) ◽  
pp. 2110-2110
Author(s):  
Stephanie Laufs ◽  
Frank Giordano ◽  
Daniel Lauterborn ◽  
K. Zsuzsanna Nagy ◽  
Kurt Fellenberg ◽  
...  

Abstract Increasing use of hematopoietic stem cells for retroviral vector-mediated gene therapy and recent reports on leukemogenesis in mice and humans have created intense interest in characterizing vector integrations on the genomic level. As techniques to determine insertion sites are more commonly applied in gene therapy laboratories, there is a need to systematically collect and analyze the data arising from such studies in a vector insertion database. This will allow determining factors responsible for preferential integration of various vector types in specific chromosomal regions, genes or gene sections. The information derived from a vector insertion database will be useful to recognize more “dangerous” vector types and may provide useful information for vector design. We have set up an automatic sequence analysis tool (ensuring quality criteria e.g. verification of LTR- and adapter sequence, score >40, e-value >10e-40, hit RefSeq, next RefSeq etc.) which simplifies data input enormously while ensuring high quality standards. Our group is establishing the "collaborative RISC (retroviral insertion estimation into chromosome) -Score Database (CRSD)"- assessment project, based on the M-CHIPS (Multi-Conditional Hybridisation Intensity Processing System) microarray data warehouse and analysis software (K. Fellenberg et al. 2001, 2002). The data obtained from the sequence analysis tool were automatically fed into the database. A total of 287 retroviral vector integration sites were isolated and sequence analysis was performed with the above-described analysis tool. In human bone marrow repopulating cells, integrations occurred with significantly increased frequency in chromosomes 17 and 19 (n=189). Analysis of targeted RefSeq genes showed a favored integration (48%) within the first intron. 
In comparison, retroviral vector integrations in T-cells (n=98) showed an entirely different chromosomal distribution pattern while the percentage of the targeted RefSeq genes was similar (46%). Further, more than 1200 sequences were submitted to the database, originating from different vectors (SF-MDR-, MoLV-based TK/neoR-Mo3TIN-, Moloney-MGMT-, Harvey-based Neo-, Harvey-based MDR-, and lentiviral GFP-SIN-vectors) and different transduced cells (mouse hematopoietic cells, mouse fibroblasts, rhesus hematopoietic cells, human hematopoietic cells, human T-cells). The set-up and internal structure of the database will be presented. Collaborations have been forged to include further groups and vector types. Bioinformatical analysis will allow recognizing even complex vector integration patterns and will broaden our understanding of the determinants of vector integration into the genome. This in turn can lead to the construction of "favorable" vectors and help to reduce the genotoxicity of retroviral or lentiviral vector-mediated gene transfer.


2021 ◽  
Author(s):  
Karsten Rink ◽  
Özgür Ozan Şen ◽  
Malte Schwanebeck ◽  
Tim Hartmann ◽  
Firdovsi Gasanzade ◽  
...  

Abstract The transition to renewable energy sources requires extensive changes to the energy system infrastructure ranging from individual households to the national scale. During this transition, stakeholders must be able to make informed decisions, researchers need to investigate possible options and analyse scenarios and the public should be informed about developments and options for future infrastructure. The data and parameters required for this are manifold and it is often difficult to create an overview of the current situation for a region of interest. We propose an environmental information system for the visualisation and exploration of large collections of heterogeneous data in the scope of energy system infrastructure and subsurface geological energy storage technologies. Based on the study area of Schleswig-Holstein, a federal state in Germany, we have set up a virtual geographic environment integrating GIS data, topographical models, subsurface information, and simulation results. The resulting application allows users to explore data collection within a unified context in 3D space, interact with datasets, and watch animation of selected simulation scenarios to gain a better understanding of the complex interactions of processes and datasets. Based on the cross-platform game engine Unity, our framework can be used on regular PCs, head-mounted displays, and virtual reality environments and can support domain scientists during assessment and exploration of the data, encourages discussions and is an effective means for outreach activities and presentations for stakeholders or the interested public.


Author(s):  
Hao Gao ◽  
Qingting Zhao ◽  
Chuanlin Ning ◽  
Difan Guo ◽  
Jing Wu ◽  
...  

In July 2021, breakthrough cases were reported in the outbreak of COVID-19 in Nanjing, sparking concern and discussion about the vaccine’s effectiveness and becoming a trending topic on Sina Weibo. In order to explore public attitudes towards the COVID-19 vaccine and their emotional orientations, we collected 1542 posts under the trending topic through data mining. We set up four categories of attitudes towards COVID-19 vaccines, used a big data analysis tool to code the posts, and manually checked the coding results to complete the content analysis. The results showed that 45.14% of the Weibo posts (n = 1542) supported the COVID-19 vaccine, 12.97% were neutral, and 7.26% were doubtful, which indicated that the public did not question the vaccine’s effectiveness due to the breakthrough cases in Nanjing. In total, 66.47% of posts reflected significant negative emotions. Among these, 50.44% of posts with negative emotions were directed towards the media, 25.07% towards the posting users, and 11.51% towards the public, which indicated that the negative emotions were not directed towards the COVID-19 vaccine. External sources outside the vaccine might cause vaccine hesitancy. Public opinions expressed in online media reflect the public’s cognition and attitude towards vaccines and their core needs in terms of information. Therefore, online public opinion monitoring could be an essential way to understand the opinions and attitudes towards public health issues.


2014 ◽  
Author(s):  
Yarden Katz ◽  
Eric T Wang ◽  
Jacob Silterra ◽  
Schraga Schwartz ◽  
Bang Wong ◽  
...  

Analysis of RNA sequencing (RNA-Seq) data revealed that the vast majority of human genes express multiple mRNA isoforms, produced by alternative pre-mRNA splicing and other mechanisms, and that most alternative isoforms vary in expression between human tissues. As RNA-Seq datasets grow in size, it remains challenging to visualize isoform expression across multiple samples. We present Sashimi plots, a quantitative multi-sample visualization of RNA-Seq reads aligned to gene annotations, which enables quantitative comparison of isoform usage across samples or experimental conditions. Given an input annotation and spliced alignments of reads from a sample, a region of interest is visualized in a Sashimi plot as follows: (i) alignments in exons are represented as read densities (optionally normalized by length of genomic region and coverage), and (ii) splice junction reads are drawn as arcs connecting a pair of exons, where arc width is drawn proportional to the number of reads aligning to the junction.
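
The two quantities described in (i) and (ii) can be sketched in a few lines. The sketch below makes two assumptions that are not stated in the abstract: the normalization is RPKM-style (reads per kilobase per million mapped reads), and arc widths are scaled linearly so the best-supported junction gets a chosen maximum width. The function names are hypothetical and do not come from the Sashimi plot software.

```python
def read_density(exon_read_count, exon_length_bp, total_mapped_reads):
    """RPKM-style density (assumed form): reads per kilobase of region
    per million mapped reads in the sample."""
    return exon_read_count / (exon_length_bp / 1e3) / (total_mapped_reads / 1e6)


def arc_widths(junction_counts, max_width=5.0):
    """Map each splice junction to an arc width proportional to the number
    of reads spanning it; the largest count receives `max_width`."""
    top = max(junction_counts.values(), default=0)
    if top == 0:
        return {junction: 0.0 for junction in junction_counts}
    return {junction: max_width * count / top
            for junction, count in junction_counts.items()}


# Hypothetical sample: one exon and two junctions leaving it.
density = read_density(exon_read_count=200, exon_length_bp=1000,
                       total_mapped_reads=2_000_000)
widths = arc_widths({("exon1", "exon2"): 40, ("exon1", "exon3"): 10})
```

With these made-up counts, the junction supported by 40 reads is drawn four times wider than the one supported by 10 reads, which is the visual cue that lets isoform usage be compared across samples.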


2018 ◽  
Author(s):  
Angela M. Early ◽  
Rachel F. Daniels ◽  
Timothy M. Farrell ◽  
Jonna Grimsby ◽  
Sarah K. Volkman ◽  
...  

Abstract Background: Deep sequencing of targeted genomic regions is becoming a common tool for understanding the dynamics and complexity of Plasmodium infections, but its lower limit of detection is currently unknown. Here, a new amplicon analysis tool, the Parallel Amplicon Sequencing Error Correction (PASEC) pipeline, is used to evaluate the performance of amplicon sequencing on low-density Plasmodium DNA samples. Illumina-based sequencing of two P. falciparum genomic regions (CSP and SERA2) was performed on two types of samples: in vitro DNA mixtures mimicking low-density infections (1-200 genomes/μl) and extracted blood spots from a combination of symptomatic and asymptomatic individuals (44-653,080 parasites/μl). Three additional analysis tools (DADA2, HaplotypR, and SeekDeep) were applied to both datasets and the precision and sensitivity of each tool were evaluated. Results: Amplicon sequencing can contend with low-density samples, showing reasonable detection accuracy down to a concentration of 5 Plasmodium genomes/μl. Due to increased stochasticity and background noise, however, all four tools showed reduced sensitivity and precision on samples with very low parasitemia (<5 copies/μl) or low read count (<100 reads per amplicon). PASEC could distinguish major from minor haplotypes with an accuracy of 90% in samples with at least 30 Plasmodium genomes/μl, but only 61% at low Plasmodium concentrations (<5 genomes/μl) and 46% at very low read counts (<25 reads per amplicon). The four tools were additionally used on a panel of extracted parasite-positive blood spots from natural malaria infections. While all four identified concordant patterns of complexity of infection (COI) across four sub-Saharan African countries, the COI values obtained for individual samples differed in some cases. Conclusions: Amplicon deep sequencing can be used to determine the complexity and diversity of low-density Plasmodium infections. 
Despite differences in their approach, four state-of-the-art tools resolved known haplotype mixtures with similar sensitivity and precision. Researchers can therefore choose from multiple robust approaches for analyzing amplicon data; however, error filtration approaches should not be uniformly applied across samples of varying parasitemia. Samples with very low parasitemia and very low read count have higher false positive rates and call for read count thresholds that are higher than current recommendations.

