Optimized Methods for Large-scale Shotgun DNA Sequencing in Alu-rich Genomic Regions

Author(s):  
F.J.M. IRIS
Author(s):  
Jianglin Feng ◽  
Nathan C Sheffield

Abstract. Summary: Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions. Availability: https://github.com/databio/IGD
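The abstract does not describe how the linear binning works, so the following is only a minimal sketch of a generic fixed-width binning scheme for interval overlap queries, not IGD's actual data layout or API. The bin width and the names `BIN_SIZE`, `build_index`, and `count_overlaps` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

BIN_SIZE = 16_384  # hypothetical bin width in base pairs (an assumption)


def build_index(intervals, bin_size=BIN_SIZE):
    """Store each half-open (chrom, start, end) interval in every bin it touches."""
    index = defaultdict(list)
    for chrom, start, end in intervals:
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            index[(chrom, b)].append((start, end))
    return index


def count_overlaps(index, chrom, qstart, qend, bin_size=BIN_SIZE):
    """Count stored intervals that overlap the half-open query [qstart, qend)."""
    first = qstart // bin_size
    last = (qend - 1) // bin_size
    hits = 0
    for b in range(first, last + 1):
        for start, end in index.get((chrom, b), ()):
            if start < qend and end > qstart:
                # An interval spanning several bins is stored several times;
                # count it only in the first bin shared with the query.
                if b == max(first, start // bin_size):
                    hits += 1
    return hits


intervals = [("chr1", 100, 200), ("chr1", 16_000, 40_000), ("chr2", 50, 60)]
index = build_index(intervals)
print(count_overlaps(index, "chr1", 150, 17_000))
```

Because a query only inspects the bins it touches, the cost depends on the number of intervals near the query rather than on the size of the whole dataset, which is the general motivation for binning approaches.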


2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Martin Johnsson ◽  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
Gregor Gorjanc ◽  
Ching-Yi Chen ◽  
...  

Abstract. Background: Meiotic recombination results in the exchange of genetic material between homologous chromosomes. Recombination rate varies between different parts of the genome and between individuals, and is influenced by genetics. In this paper, we assessed the genetic variation in recombination rate along the genome and between individuals in the pig using multilocus iterative peeling on 150,000 individuals across nine genotyped pedigrees. We used these data to estimate the heritability of recombination and perform a genome-wide association study of recombination in the pig. Results: Our results confirmed known features of the recombination landscape of the pig genome, including differences in genetic length of chromosomes and marked sex differences. The recombination landscape was repeatable between lines, but at the same time, there were differences in average autosome-wide recombination rate between lines. The heritability of autosome-wide recombination rate was low but not zero (on average 0.07 for females and 0.05 for males). We found six genomic regions that are associated with recombination rate, among which five harbour known candidate genes involved in recombination: RNF212, SHOC1, SYCP2, MSH4 and HFM1. Conclusions: Our results on the variation in recombination rate in the pig genome agree with those reported for other vertebrates, with a low but nonzero heritability, and the identification of a major quantitative trait locus for recombination rate that is homologous to that detected in several other species. This work also highlights the utility of using large-scale livestock data to understand biological processes.


2006 ◽  
Vol 04 (03) ◽  
pp. 639-647 ◽  
Author(s):  
ELEAZAR ESKIN ◽  
RODED SHARAN ◽  
ERAN HALPERIN

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical distance between neighboring single nucleotide polymorphisms. The approach is very efficient; it has been applied to several large-scale datasets and shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .


2020 ◽  
Author(s):  
Michael W J Hall ◽  
David Shorthouse ◽  
Philip H Jones ◽  
Benjamin A Hall

Abstract. The recent development of highly sensitive DNA sequencing techniques has detected large numbers of missense mutations of genes, including NOTCH1 and 2, in ageing normal tissues. Driver mutations persist and propagate in the tissue through a selective advantage over both wild-type cells and alternative mutations. This process of selection can be considered as a large-scale in vivo screen for mutations that increase clone fitness. It follows that the specific missense mutations that are observed in individual genes may offer us insights into the structure-function relationships. Here we show that the positively selected missense mutations in NOTCH1 and NOTCH2 in human oesophageal epithelium cause inactivation predominantly through protein misfolding. Once these mutations are excluded, we further find statistically significant evidence for selection at the ligand binding interface and calcium binding sites. In this, we observe stronger evidence of selection at the ligand interface on EGF12 over EGF11, suggesting that in this tissue EGF12 may play a more important role in ligand interaction. Finally, we show how a mutation hotspot in the NOTCH1 transmembrane helix arises through the intersection of both a high mutation rate and residue conservation. Together these insights offer a route to understanding the mechanism of protein function through in vivo mutant selection.


2017 ◽  
Vol 10 (1) ◽  
Author(s):  
Kolapo M. Oyebola ◽  
Emmanuel T. Idowu ◽  
Yetunde A. Olukosi ◽  
Taiwo S. Awolola ◽  
Alfred Amambua-Ngwa
