Vcflib and tools for processing the VCF variant call format

2021 ◽  
Author(s):  
Erik Garrison ◽  
Zev N Kronenberg ◽  
Eric T Dawson ◽  
Brent S Pedersen ◽  
Pjotr Prins

Since its introduction in 2011, the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies, as well as in somatic and germline mutation studies. VCF can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called against a reference genome. Here we present over 125 useful and much-used free and open source software tools and libraries that are part of vcflib tools and bio-vcf. We also highlight the cyvcf2, hts-nim and slivar tools. They are typically applied in the comparison, filtering, normalisation, smoothing, annotation, statistics, visualisation and exporting of variants. Our tools run daily and invisibly in pipelines and countless shell scripts. They are part of a wider bioinformatics ecosystem, and we consider it very important to make them available as free and open source software to all bioinformaticians so that they can be deployed through software distributions such as Debian, GNU Guix and Bioconda. For example, by December 2020 vcflib had been installed over 40,000 times and bio-vcf over 15,000 times through Bioconda. We briefly discuss the design of VCF, lessons learnt, and how we can address more complex variation that cannot easily be represented in VCF. All source code is published under free and open source software licenses and can be downloaded and installed from https://github.com/vcflib.

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gavin W. Wilson ◽  
Mathieu Derouet ◽  
Gail E. Darling ◽  
Jonathan C. Yeung

Abstract: Identifying single nucleotide variants has become common practice for droplet-based single-cell RNA-seq experiments; however, no pipeline currently exists to maximize variant calling accuracy. Furthermore, molecular duplicates generated in these experiments have not been utilized to optimally detect variant co-expression. Herein, we introduce scSNV, designed from the ground up to “collapse” molecular duplicates and accurately identify variants and their co-expression. We demonstrate that scSNV is fast, has a reduced false-positive variant call rate, and enables the co-detection of genetic variants and A>G RNA edits across twenty-two samples.


2018 ◽  
Author(s):  
Maxime Garcia ◽  
Szilveszter Juhos ◽  
Malin Larsson ◽  
Pall I. Olason ◽  
Marcel Martin ◽  
...  

Abstract. Summary: Whole-genome sequencing (WGS) is a cornerstone of precision medicine, but portable and reproducible open-source workflows for WGS analyses of germline and somatic variants are lacking. We present Sarek, a modular, comprehensive, and easy-to-install workflow, combining a range of software for the identification and annotation of single-nucleotide variants (SNVs), insertion and deletion variants (indels), structural variants, tumor sample heterogeneity, and karyotyping from germline or paired tumor/normal samples. Sarek is implemented in a bioinformatics workflow language (Nextflow) with Docker and Singularity compatible containers, ensuring easy deployment and full reproducibility at any Linux based compute cluster or cloud computing environment. Sarek supports the human reference genomes GRCh37 and GRCh38, and can readily be used both as a core production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. Availability: Source code and instructions for local installation are available at GitHub (https://github.com/SciLifeLab/Sarek) under the MIT open-source license, and we invite the research community to contribute additional functionality as a collaborative open-source development project.


Author(s):  
Kwei-Jay Lin ◽  
Yi-Hsuan Lin ◽  
Tung-Mei Ko

In this chapter, the authors present a novel perspective by using the Creative Commons (CC) licensing model to compare 10 commonly used OSS licenses. The authors also propose a license compatibility table to show whether it is possible to combine OSS with CC-licensed open content in a creative work. By using the CC licensing concept to interpret OSS licenses, the authors hope that users can gain a deeper understanding of the ideas and issues behind many of the OSS licenses. In addition, the authors hope that this table will help users make better license selection decisions when combining open source with CC-licensed works.


2009 ◽  
pp. 2978-2990
Author(s):  
Kwei-Jay Lin ◽  
Yi-Hsuan Lin ◽  
Tung-Mei Ko

In this chapter, the authors present a novel perspective by using the Creative Commons (CC) licensing model to compare 10 commonly used OSS licenses. The authors also propose a license compatibility table to show whether it is possible to combine OSS with CC-licensed open content in a creative work. By using the CC licensing concept to interpret OSS licenses, the authors hope that users can gain a deeper understanding of the ideas and issues behind many of the OSS licenses. In addition, the authors hope that this table will help users make better license selection decisions when combining open source with CC-licensed works.

