Probing the dark matter of the human genome with big DNA

2019 ◽  
Vol 41 (3) ◽  
pp. 46-48
Author(s):  
Jon M. Laurent ◽  
Sudarshan Pinglay ◽  
Leslie Mitchell ◽  
Ran Brosh

Less than 2% of our genome is protein-coding DNA. The vast expanses of non-coding DNA make up the genome's “dark matter”, where introns, repetitive and regulatory elements reside. Variation between individuals in non-coding regulatory DNA is emerging as a major factor in the genetics of numerous diseases and traits, yet very little is known about how such variations contribute to disease risk. Studying the genetics of regulatory variation is technically challenging as regulatory elements can affect genes located tens of thousands of base pairs away, and often, multiple distal regulatory variations, each with a very small effect, combine in an unknown way to significantly modulate the expression of genes. At the Center for Synthetic Regulatory Genomics (SyRGe) we directly tackle these problems in order to systematically elucidate the mechanisms of regulatory variation underlying human disease.

2021 ◽  
Author(s):  
Naoto Kubota ◽  
Mikita Suyama

Abstract: Genome-wide association studies (GWAS) have been performed to identify thousands of variants in the human genome as disease risk markers, but functional variants that actually affect gene regulation and their genomic features remain largely unknown. Here we performed a comprehensive survey of functional variants in the regulatory elements of the human genome. We integrated hematopoietic transcription factor (TF) footprint datasets generated by the ENCODE project with multiple quantitative trait locus (QTL) datasets (eQTL, caQTL, bQTL, and hQTL) and investigated the associations between functional variants and immune system disease risk. We identified candidate regulatory variants highly linked with GWAS lead variants and found that they were strongly enriched in active enhancers in hematopoietic cells, emphasizing the clinical relevance of enhancers in disease risk. Moreover, we found some strong relationships between traits and hematopoietic cell types or TFs. We highlighted some credible regulatory variants and found that a variant, rs2291668, which potentially functions in the molecular pathogenesis of multiple sclerosis, is located within a TF footprint present in a protein-coding exon of the TNFSF14 gene, indicating that protein-coding exons as well as noncoding regions can possess clinically relevant regulatory elements. Collectively, our results shed light on the molecular pathogenesis of immune system diseases. The methods described in this study can readily be applied to the study of the risk factors of other diseases.


2021 ◽  
Author(s):  
Thanh Thanh Le Nguyen ◽  
Huanyao Gao ◽  
Duan Liu ◽  
Zhenqing Ye ◽  
Jeong-Heon Lee ◽  
...  

Abstract: Understanding the function of non-coding genetic variants represents a formidable challenge for biomedicine. We previously identified genetic variants that influence gene expression only after exposure to a hormone or drug. Using glucocorticoid signaling as a model system, we have now demonstrated, in a genome-wide manner, that exposure to glucocorticoids triggered disease risk variants with previously unclear function to influence the expression of genes involved in autoimmunity, metabolic and mood disorders, osteoporosis and cancer. Integrating a series of genomic and epigenomic assays, we identified the cis-regulatory elements and 3-dimensional interactions underlying the ligand-dependent associations between those genetic variants and distant risk genes. These observations increase our understanding of mechanisms of non-coding genetic variant-chemical environment interactions and advance the fine-mapping of disease risk and pharmacogenomic loci.

One Sentence Summary: Genomic and epigenomic fine-mapping of ligand-dependent genetic variants unmasks novel disease risk genes.


2019 ◽  
Author(s):  
Wouter Meuleman ◽  
Alexander Muratov ◽  
Eric Rynes ◽  
Jessica Halow ◽  
Kristen Lee ◽  
...  

Abstract: DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA and harbor disease- and phenotypic trait-associated genetic variation. We established high-precision maps of DNase I hypersensitive sites from 733 human biosamples encompassing 439 cell and tissue types and states, and integrated these to precisely delineate and numerically index ~3.6 million DHSs encoded within the human genome, providing a common coordinate system for regulatory DNA. Here we show that the expansive scale of cell and tissue states sampled exposes an unprecedented degree of stereotyped actuation of large sets of elements, signaling the operation of distinct genome-scale regulatory programs. We show further that the complex actuation patterns of individual elements can be captured comprehensively by a simple regulatory vocabulary reflecting their dominant cellular manifestation. This vocabulary, in turn, enables comprehensive and quantitative regulatory annotation of both protein-coding genes and the vast array of well-defined but poorly characterized non-coding RNA genes. Finally, we show that the combination of high-precision DHSs and regulatory vocabularies markedly concentrates disease- and trait-associated non-coding genetic signals both along the genome and across cellular compartments. Taken together, our results provide a common and extensible coordinate system and vocabulary for human regulatory DNA, and a new global perspective on the architecture of human gene regulation.


2020 ◽  
Vol 23 (2) ◽  
pp. 113-120
Author(s):  
A. Athanassiadou

Determination of the DNA sequence of the human genome, revealing extensive genetic variation, and the mapping of the genes and the various regulatory elements of genome function within the genomic DNA, have revolutionized the way we view the states of health and disease in our time. Genetic complexity of the genome is manifested on different levels. The first level refers to the expression of protein-coding genes, as regulated by their individual promoters in linear proximity. The next level of genetic complexity involves long-distance action by faraway enhancers, interacting with promoters through DNA looping. This 3-dimensional (3D) regulation is further developed by chromosome folding into the so-called transcription factories, for fully physiological expression. Chromosome folding, mediated by specific genetic elements (insulators), adds to the genetic complexity by facilitating movements of chromatin of specific genomic regions, the so-called topologically associated domains (TADs), in support of transcription and other cellular functions. Further genetic complexity has emerged with the finding that over 75% of the genome is transcribed and that, apart from the coding genes, a plethora of RNA transcripts are produced (the non-coding RNAs) that have important regulatory roles in the gene expression context. The great variation of genome sequence and regulatory elements of the genome architecture is exploited in studies of genome-wide association with disease, in the framework of Precision Medicine and, more generally, of Genomic Medicine.


2016 ◽  
Vol 44 (4) ◽  
pp. 1073-1078 ◽  
Author(s):  
Rogerio Alves de Almeida ◽  
Marcin G. Fraczek ◽  
Steven Parker ◽  
Daniela Delneri ◽  
Raymond T. O'Keefe

Many human diseases have been attributed to mutations in the protein-coding regions of the human genome. The protein-coding portion of the human genome, however, is very small compared with the non-coding portion of the genome. As such, there are a disproportionate number of diseases attributed to the coding compared with the non-coding portion of the genome. It is now clear that the non-coding portion of the genome produces many functional non-coding RNAs and these RNAs are slowly being linked to human diseases. Here we discuss examples where mutations in classical non-coding RNAs have been attributed to human disease and identify the future potential for the non-coding portion of the genome in disease biology.


Plants ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1456
Author(s):  
Xin Jin ◽  
Can Baysal ◽  
Margit Drapal ◽  
Yanmin Sheng ◽  
Xin Huang ◽  
...  

Light is an essential regulator of many developmental processes in higher plants. We investigated the effect of 4-hydroxy-3-methylbut-2-enyl diphosphate reductase 1/2 genes (OsHDR1/2) and isopentenyl diphosphate isomerase 1/2 genes (OsIPPI1/2) on the biosynthesis of chlorophylls, carotenoids, and phytosterols in 14-day-old etiolated rice (Oryza sativa L.) leaves during de-etiolation. However, little is known about the effect of isoprenoid biosynthesis genes on the corresponding metabolites during the de-etiolation of etiolated rice leaves. The results showed that the levels of α-tocopherol were significantly increased in de-etiolated rice leaves. Similar to the 1-deoxy-D-xylulose-5-phosphate synthase 3 gene (OsDXS3), both the OsDXS1 and OsDXS2 genes encode functional 1-deoxy-D-xylulose-5-phosphate synthase (DXS) activities. Their expression patterns and the synthesis of chlorophyll, carotenoid, and tocopherol metabolites suggested that OsDXS1 is responsible for the biosynthesis of plastidial isoprenoids in de-etiolated rice leaves. The expression analysis of isoprenoid biosynthesis genes revealed that the coordinated expression of the MEP (2-C-methyl-D-erythritol 4-phosphate) pathway, chlorophyll, carotenoid, and tocopherol pathway genes mirrored the changes in the levels of the corresponding metabolites during de-etiolation. The underpinning mechanistic basis of coordinated light-upregulated gene expression was elucidated during the de-etiolation process, specifically the role of light-responsive cis-regulatory motifs in the promoter region of these genes. In silico promoter analysis showed that light-responsive cis-regulatory elements were present in the promoter region of every light-upregulated gene, providing an important link between the observed phenotype during de-etiolation and the molecular machinery controlling expression of these genes.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Svetlana Kalmykova ◽  
Marina Kalinina ◽  
Stepan Denisov ◽  
Alexey Mironov ◽  
Dmitry Skvortsov ◽  
...  

Abstract: The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. Current knowledge on functional RNA structures is focused on locally-occurring base pairs. However, crosslinking and proximity ligation experiments demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete to-date catalog of conserved complementary regions (PCCRs) in human protein-coding genes. PCCRs tend to occur within introns, suppress intervening exons, and obstruct cryptic and inactive splice sites. Double-stranded structure of PCCRs is supported by decreased icSHAPE nucleotide accessibility, high abundance of RNA editing sites, and frequent occurrence of forked eCLIP peaks. Introns with PCCRs show a distinct splicing pattern in response to RNAPII slowdown, suggesting that splicing is widely affected by co-transcriptional RNA folding. The enrichment of 3’-ends within PCCRs raises the intriguing hypothesis that coupling between RNA folding and splicing could mediate co-transcriptional suppression of premature pre-mRNA cleavage and polyadenylation.

