Will Gene Patents Impede Whole Genome Sequencing?: Deconstructing the Myth that Twenty Percent of the Human Genome is Patented

Author(s):  
Christopher M. Holman
2018 ◽  
Author(s):  
Xun Chen ◽  
Dawei Li

Abstract Motivation Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using next-generation sequencing (NGS) data. Results We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark whole-genome sequencing (WGS) datasets. By comparing with existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. PCR and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species. Availability www.uvm.edu/genomics/software/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (20) ◽  
pp. 3913-3922 ◽  
Author(s):  
Xun Chen ◽  
Dawei Li

Abstract Motivation Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data. Results We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species. Availability and implementation http://www.uvm.edu/genomics/software/ERVcaller.html. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 66 (1) ◽  
pp. 39-52
Author(s):  
Tomoya Tanjo ◽  
Yosuke Kawai ◽  
Katsushi Tokunaga ◽  
Osamu Ogasawara ◽  
Masao Nagasaki

Abstract Studies in human genetics deal with a plethora of human genome sequencing data that are generated from specimens as well as available on public domains. With the development of various bioinformatics applications, maintaining the productivity of research, managing human genome data, and analyzing downstream data is essential. This review aims to guide struggling researchers to process and analyze these large-scale genomic data to extract relevant information for improved downstream analyses. Here, we discuss worldwide human genome projects that could be integrated into any data for improved analysis. Obtaining human whole-genome sequencing data from both data stores and processes is costly; therefore, we focus on the development of data format and software that manipulate whole-genome sequencing. Once the sequencing is complete and its format and data processing tools are selected, a computational platform is required. For the platform, we describe a multi-cloud strategy that balances between cost, performance, and customizability. A good quality published research relies on data reproducibility to ensure quality results, reusability for applications to other datasets, as well as scalability for the future increase of datasets. To solve these, we describe several key technologies developed in computer science, including workflow engine. We also discuss the ethical guidelines inevitable for human genomic data analysis that differ from model organisms. Finally, the future ideal perspective of data processing and analysis is summarized.


Author(s):  
Prisca K. Thami ◽  
Wonderful T. Choga ◽  
Delesa Damena Mulisa ◽  
Collet Dandara ◽  
Andrey K. Shevchenko ◽  
...  

The study of human genome variations can contribute towards understanding population diversity and the genetic aetiology of health-related traits. We sought to characterise human genomic variations of Botswana in order to assess diversity and elucidate mutation burden in the population using whole genome sequencing. Whole genome sequences of 390 unrelated individuals from Botswana were available for computational analysis. The sequences were mapped to the human reference genome GRCh38. Population joint variant calling was performed using Genome Analysis Toolkit (GATK) and BCFtools. Variant characterisation was achieved by annotating the variants with a suite of databases in ANNOVAR and SnpEff. The genomic architecture of Botswana was delineated through principal component analysis, structure analysis and FST. We identified a total of 27.7 million unique variants. Variant prioritisation revealed 24 damaging variants with the most damaging variants being ACTRT2 rs3795263, HOXD12 rs200302685, ABCB5 rs111647033, ATP8B4 rs77004004 and ABCC12 rs113496237. We observed admixture of the Khoe-San, Niger-Congo and European ancestries in the population of Botswana; however, population substructure was not observed. This exploration of whole genome sequences presents a comprehensive characterisation of human genomic variations in the population of Botswana and their potential in contributing to a deeper understanding of population diversity and health in Africa and the African diaspora.


2020 ◽  
Vol 36 (12) ◽  
pp. 3877-3878
Author(s):  
Mark Grivainis ◽  
Zuojian Tang ◽  
David Fenyö

Abstract Motivation Retrotransposition is an important force in shaping the human genome and is involved in prenatal development, disease and aging. Current genome browsers are not optimized for visualizing the experimental evidence for retrotransposon insertions. Results We have developed a specialized browser to visualize the evidence for retrotransposon insertions for both targeted and whole-genome sequencing data. Availability and implementation TranspoScope’s source code, as well as installation instructions, are available at https://github.com/FenyoLab/transposcope.


2018 ◽  
Author(s):  
Mark Stevenson ◽  
Alistair T Pagnamenta ◽  
Heather G Mack ◽  
Judith A Savige ◽  
Kate E Lines ◽  
...  
