Triplet repeats in human genome: distribution and their association with genes and other genomic regions

2003 ◽  
Vol 19 (5) ◽  
pp. 549-552 ◽  
Author(s):  
S. Subramanian ◽  
V. M. Madgula ◽  
R. George ◽  
R. K. Mishra ◽  
M. W. Pandit ◽  
...  
Genetics ◽  
1999 ◽  
Vol 152 (4) ◽  
pp. 1711-1722 ◽  
Author(s):  
Gavin A Huttley ◽  
Michael W Smith ◽  
Mary Carrington ◽  
Stephen J O’Brien

Abstract Linkage disequilibrium (LD), the tendency for alleles of linked loci to co-occur nonrandomly on chromosomal haplotypes, is an increasingly useful phenomenon for (1) revealing historic perturbation of populations including founder effects, admixture, or incomplete selective sweeps; (2) estimating elapsed time since such events based on time-dependent decay of LD; and (3) disease and phenotype mapping, particularly for traits not amenable to traditional pedigree analysis. Because few descriptions of LD for most regions of the human genome exist, we searched the human genome for the amount and extent of LD among 5048 autosomal short tandem repeat polymorphism (STRP) loci ascertained as specific haplotypes in the European CEPH mapping families. Evidence is presented indicating that ∼4% of STRP loci separated by <4.0 cM are in LD. The fraction of locus pairs within these intervals that display small Fisher’s exact test (FET) probabilities is directly proportional to the inverse of recombination distance between them (1/cM). The distribution of LD is nonuniform on a chromosomal scale and in a marker density-independent fashion, with chromosomes 2, 15, and 18 being significantly different from the genome average. Furthermore, a stepwise (locus-by-locus) 5-cM sliding-window analysis across 22 autosomes revealed nine genomic regions (2.2-6.4 cM), where the frequency of small FET probabilities among loci was greater than or equal to that presented by the HLA on chromosome 6, a region known to have extensive LD. Although the spatial heterogeneity of LD we detect in Europeans is consistent with the operation of natural selection, absence of a formal test for such genomic scale data prevents eliminating neutral processes as the evolutionary origin of the LD.
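The pairwise test described in this abstract can be illustrated with a small sketch. It assumes the simplest case of two biallelic loci, so the haplotype counts form a 2x2 table; the STRP loci in the study are multi-allelic, so the authors' actual tables and procedure will differ. The function names `fisher_exact_2x2` and `ld_coefficient` are hypothetical and chosen only for this illustration.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def ld_coefficient(a, b, c, d):
    """D = p(AB) - p(A) * p(B) from haplotype counts AB=a, Ab=b, aB=c, ab=d."""
    n = a + b + c + d
    return a / n - ((a + b) / n) * ((a + c) / n)


# Hypothetical counts: 20 haplotypes in which alleles A and B always co-occur.
p_value = fisher_exact_2x2(10, 0, 0, 10)
d_value = ld_coefficient(10, 0, 0, 10)
```

A small `p_value` for a locus pair is the kind of "small FET probability" the abstract tallies within each recombination-distance interval.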


2021 ◽  
Author(s):  
Catarina Silva ◽  
Miguel Machado ◽  
José Ferrão ◽  
Sebastião Rodrigues ◽  
Luís Vieira

DNA methylation is a type of epigenetic modification that affects gene expression regulation and is associated with several human diseases. Microarray and short read sequencing technologies are often used to study 5'-methylcytosine (5'-mC) modification of CpG dinucleotides in the human genome. Although both technologies produce reliable results, the evaluation of the methylation status of CpG sites suffers from the potential side effects of DNA modification by bisulfite and the ambiguity of mapping short reads in repetitive and highly homologous genomic regions, respectively. Nanopore sequencing is an attractive alternative for the study of 5'-mC since the long reads produced by this technology make those genomic regions easier to resolve. Moreover, it allows direct sequencing of native DNA molecules using a fast library preparation procedure. In this work we show that 10X coverage depth nanopore sequencing, using DNA from a human cell line, produces 5'-mC methylation frequencies consistent with those obtained by methylation microarray and digital restriction enzyme analysis of methylation. In particular, the correlation of methylation values ranged from 0.73 to 0.90 using an average genome sequencing coverage depth <2X or a minimum read support of 17X for each CpG site, respectively. We also showed that a minimum of 5 reads per CpG yields strong correlations (>0.89) between sequencing runs and an almost uniform variation in methylation frequencies of CpGs across the entire value range. Furthermore, nanopore sequencing was able to correctly display methylation frequency patterns according to genomic annotations, including a majority of unmethylated and methylated sites in the CpG islands and inter-CpG island regions, respectively. These results demonstrate that low coverage depth nanopore sequencing is a fast, reliable and unbiased approach to the study of 5'-mC in the human genome.
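The read-support filter and run-to-run correlation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input layout (a dict mapping each CpG site to a pair of methylated and total read counts), the function names, and the example counts are all assumptions made for this sketch. Only the threshold of 5 reads per CpG comes from the abstract.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlate_runs(run_a, run_b, min_reads=5):
    """Correlate per-site methylation frequencies between two runs.

    Each run maps a CpG site to (methylated_reads, total_reads).
    Only sites with at least `min_reads` reads in both runs are compared.
    """
    shared = [
        site for site in run_a
        if site in run_b
        and run_a[site][1] >= min_reads
        and run_b[site][1] >= min_reads
    ]
    freq_a = [run_a[s][0] / run_a[s][1] for s in shared]
    freq_b = [run_b[s][0] / run_b[s][1] for s in shared]
    return len(shared), pearson(freq_a, freq_b)


# Hypothetical counts; the last site has only 3 reads and is filtered out.
run_a = {"chr1:100": (9, 10), "chr1:200": (1, 10), "chr1:300": (5, 10), "chr1:400": (2, 3)}
run_b = {"chr1:100": (8, 10), "chr1:200": (2, 10), "chr1:300": (5, 10), "chr1:400": (3, 3)}
n_sites, r = correlate_runs(run_a, run_b)
```

Raising `min_reads` trades the number of comparable sites for less noisy per-site frequencies, which is the trade-off the abstract's coverage thresholds describe.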


2018 ◽  
Vol 47 (4) ◽  
pp. e21-e21 ◽  
Author(s):  
Marius Gheorghe ◽  
Geir Kjetil Sandve ◽  
Aziz Khan ◽  
Jeanne Chèneby ◽  
Benoit Ballester ◽  
...  

Abstract Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the most popular assay to identify genomic regions, called ChIP-seq peaks, that are bound in vivo by transcription factors (TFs). These regions are derived from direct TF–DNA interactions, indirect binding of the TF to the DNA (through a co-binding partner), nonspecific binding to the DNA, and noise/bias/artifacts. Delineating the bona fide direct TF–DNA interactions within the ChIP-seq peaks remains challenging. We developed a dedicated software, ChIP-eat, that combines computational TF binding models and ChIP-seq peaks to automatically predict direct TF–DNA interactions. Our work culminated with predicted interactions covering >2% of the human genome, obtained by uniformly processing 1983 ChIP-seq peak data sets from the ReMap database for 232 unique TFs. The predictions were a posteriori assessed using protein binding microarray and ChIP-exo data, and were predominantly found in high quality ChIP-seq peaks. The set of predicted direct TF–DNA interactions suggested that high-occupancy target regions are likely not derived from direct binding of the TFs to the DNA. Our predictions derived co-binding TFs supported by protein-protein interaction data and defined cis-regulatory modules enriched for disease- and trait-associated SNPs. We provide this collection of direct TF–DNA interactions and cis-regulatory modules through the UniBind web-interface (http://unibind.uio.no).


Genomics ◽  
2015 ◽  
Vol 106 (2) ◽  
pp. 88-95 ◽  
Author(s):  
Hongyu Zhao ◽  
Yongqiang Xing ◽  
Guoqing Liu ◽  
Ping Chen ◽  
Xiujuan Zhao ◽  
...  

iScience ◽  
2021 ◽  
Vol 24 (2) ◽  
pp. 102048
Author(s):  
Leonidas P. Karakatsanis ◽  
Evgenios G. Pavlos ◽  
George Tsoulouhas ◽  
Georgios L. Stamokostas ◽  
Timothy Mosbruger ◽  
...  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Majid Mehravar ◽  
Fatemeh Ghaemimanesh ◽  
Ensieh M. Poursani

Abstract Background: Overlapping genes share the same genomic regions in parallel (sense) or anti-parallel (anti-sense) orientations. These gene pairs seem to occur in all domains of life and are best known from viruses. However, the advantage and biological significance of overlapping genes are still unclear. Expressed sequence tag (EST) analysis enabled us to uncover an overlapping gene pair in the human genome. Results: By in silico analysis of previous experimental documentation, we reveal a new form of overlapping genes in the human genome, in which two genes found on opposite strands (Pou5f1 and Tcf19) share two exons and one enclosed intron, at the same positions, between the OCT4B3 and TCF19-D splice variants. Conclusions: This new form of overlapping gene expands our previous perception of splicing events and may shed more light on the complexity of gene regulation in higher organisms. Additional such genes might be detected by EST analysis of other organisms as well.


2016 ◽  
Author(s):  
Musaddeque Ahmed ◽  
Richard C. Sallari ◽  
Haiyang Guo ◽  
Jason H. Moore ◽  
Housheng Hansen He ◽  
...  

Abstract Summary: Genetic predispositions to diseases populate the noncoding regions of the human genome. Delineating their functional basis can inform on the mechanisms contributing to disease development. However, this remains a challenge due to the poor characterization of the noncoding genome. Variant Set Enrichment (VSE) is a fast method to calculate the enrichment of a set of disease-associated variants across functionally annotated genomic regions, consequently highlighting the mechanisms important in the etiology of the disease studied. Availability and Implementation: VSE is implemented as an R package and can easily be used on any system with R. See the supplementary information.
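The general idea of scoring a variant set against annotated regions can be sketched as below. This is a generic null-comparison enrichment written in Python for illustration. It is not the VSE R package's API, and the abstract does not specify how the null sets are built. The names `count_overlaps` and `enrichment_score`, the half-open interval convention, and the toy coordinates are all assumptions.

```python
from statistics import mean, pstdev


def count_overlaps(variants, regions):
    """Count variants (chrom, pos) that fall in any region (chrom, start, end).

    Regions are treated as half-open intervals [start, end).
    """
    return sum(
        any(chrom == r_chrom and start <= pos < end
            for r_chrom, start, end in regions)
        for chrom, pos in variants
    )


def enrichment_score(observed_set, null_sets, regions):
    """Standard deviations by which the observed overlap exceeds the null mean."""
    observed = count_overlaps(observed_set, regions)
    null_counts = [count_overlaps(ns, regions) for ns in null_sets]
    mu, sd = mean(null_counts), pstdev(null_counts)
    if sd == 0:
        raise ValueError("null overlap counts have zero variance")
    return (observed - mu) / sd


# Hypothetical annotation and variant sets.
regions = [("chr1", 100, 200)]
observed_set = [("chr1", 150), ("chr1", 160), ("chr1", 500)]
null_sets = [
    [("chr1", 150), ("chr1", 900), ("chr2", 150)],
    [("chr1", 900), ("chr1", 950), ("chr2", 150)],
    [("chr1", 199), ("chr1", 200), ("chr2", 120)],
    [("chr1", 50), ("chr1", 99), ("chr2", 180)],
]
score = enrichment_score(observed_set, null_sets, regions)
```

In practice the null sets would be drawn to match the observed variants on properties that affect overlap, and there would be far more than four of them.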


2021 ◽  
Author(s):  
Moataz Dowaidar

The human genome contains many genomic regions that can produce a large number of transcripts. RNAs that can function as both mRNA and noncoding RNA (lncRNA/snoRNA/miRNA) are known as bifunctional RNAs, or bifRNAs. BifRNAs have been detected in organisms ranging from microorganisms to humans. Cells may precisely modulate the functions of the coding and noncoding regions of bifRNAs to meet regulatory needs. However, it has not been thoroughly investigated whether the same gene locus may produce two types of functional noncoding transcripts, such as lncRNAs and miRNAs. These "bifunctional nc RNAs" are the topic of this review, which covers the current information regarding lncMIRHGs. Some LINC MONOMER transcripts have not been proven to be "junk" according to this functional and mechanistic research. It is possible that the lncMIRHG locus makes both functional miRNAs and lncRNAs, some of which can act together and others of which may act independently. The data gathered via research by the NEAT1 organization also indicates that miRNA may function as a "pseudoRNA," with lncRNA produced from the lncMIRHG gene locus serving as the lead. This class of lncRNAs deserves considerable attention, given the putative dual functions of lncMIRHG loci as sources of lncRNA and miRNA. LncMIRHGs are involved in a broad range of processes, including those seen in disorders such as cancer. A full understanding of the mechanisms of this lncRNA repertoire will be useful for drug discovery and development.


2019 ◽  
Vol 12 (2) ◽  
pp. 3762-3777 ◽  
Author(s):  
Henrik Devitt Møller ◽  
Jazmín Ramos-Madrigal ◽  
Iñigo Prada-Luengo ◽  
M Thomas P Gilbert ◽  
Birgitte Regenberg

Abstract Extrachromosomal circular DNA (eccDNA) elements of chromosomal origin are known to be common in a number of eukaryotic species. However, it remains to be addressed whether genomic features such as genome size, the load of repetitive elements within a genome, and/or animal physiology affect the number of eccDNAs. Here, we investigate the distribution and numbers of eccDNAs in a condensed and less repeat-rich genome compared with the human genome, using Columba livia domestica (domestic rock pigeon) as a model organism. By sequencing eccDNA in blood and breast muscle from three pigeon breeds at various ages and with different flight behavior, we characterize 30,000 unique eccDNAs. We identify genomic regions that are likely hotspots for DNA circularization in breast muscle, including genes involved in muscle development. We find that although eccDNA counts do not correlate with the biological age in pigeons, the number of unique eccDNAs in a nonflying breed (king pigeons) is significantly higher (9-fold) than in homing pigeons. Furthermore, a comparison between eccDNA from skeletal muscle in pigeons and humans reveals ∼9-10 times more unique eccDNAs per human nucleus. The fraction of eccDNA sequences derived from repetitive elements exists in proportion to genome content, that is, human 72.4% (expected 52.5%) and pigeon 8.7% (expected 5.5%). Overall, our results support that eccDNAs are common in pigeons, that the amount of unique eccDNA types per nucleus can differ between species as well as subspecies, and suggest that eccDNAs from repeats are found in proportions relative to the content of repetitive elements in a genome.
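The observed and expected repeat fractions quoted in this abstract can be compared directly as ratios. The short sketch below uses only the four percentages given above. The helper name `observed_over_expected` is hypothetical, and the ratio is a simple reading aid rather than a statistic reported by the authors.

```python
def observed_over_expected(observed_pct, expected_pct):
    """Ratio of the observed repeat-derived eccDNA fraction to the expected one."""
    return observed_pct / expected_pct


# Percentages taken from the abstract: (observed, expected).
repeat_fractions = {
    "human": (72.4, 52.5),
    "pigeon": (8.7, 5.5),
}

ratios = {
    species: round(observed_over_expected(obs, exp), 2)
    for species, (obs, exp) in repeat_fractions.items()
}
```

Both ratios land between 1 and 2, so the repeat-derived share of eccDNA tracks genome repeat content in both species while sitting somewhat above it.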

