RefSeq Functional Elements as experimentally assayed nongenic reference standards and functional interactions in human and mouse

2021 ◽  
pp. gr.275819.121
Author(s):  
Catherine M Farrell ◽  
Tamara Goldfarb ◽  
Sanjida H Rangwala ◽  
Alexander Astashyn ◽  
Olga D Ermolaeva ◽  
...  

Eukaryotic genomes contain many nongenic elements that function in gene regulation, chromosome organization, recombination, repair or replication, and mutation of those elements can affect genome function and cause disease. While numerous epigenomic studies provide high coverage of gene regulatory regions, those data are not usually exposed in traditional genome annotation, and can be difficult to access and interpret without field-specific expertise. The National Center for Biotechnology Information (NCBI) therefore provides RefSeq Functional Elements (RefSeqFEs), which represent experimentally validated human and mouse nongenic elements derived from the literature. The curated dataset comprises richly annotated sequence records, descriptive records in the NCBI Gene database, reference genome feature annotation, and activity-based interactions of nongenic regions with target genes and with each other. The dataset provides succinct functional details and transparent experimental evidence, leverages data from multiple experimental sources, is readily accessible and adaptable, and utilizes a flexible data model. The data have multiple uses: basic functional discovery, bioinformatics studies, genetic variant interpretation, known positive controls for epigenomic data evaluation, and reference standards for functional interactions. Comparisons to other gene regulatory datasets show that the RefSeqFE dataset includes a wider range of feature types representing more areas of biology, but it is comparatively smaller and subject to data selection biases. RefSeqFEs thus provide an alternative and complementary resource for experimentally assayed functional elements, with future dataset growth expected.

2019 ◽  
Vol 21 (4) ◽  
pp. 1465-1478 ◽  
Author(s):  
Aimin Li ◽  
Peilin Jia ◽  
Saurav Mallik ◽  
Rong Fei ◽  
Hiroki Yoshioka ◽  
...  

Abstract Cleft palate (CP) is the second most common congenital birth defect. The etiology of CP is complicated, with involvement of various genetic and environmental factors. To investigate the gene regulatory mechanisms, we designed a powerful regulatory analytical approach to identify the conserved regulatory networks in humans and mice, from which we identified critical microRNAs (miRNAs), target genes and regulatory motifs (miRNA–TF–gene) related to CP. Using our manually curated genes and miRNAs with evidence in CP in humans and mice, we constructed miRNA and transcription factor (TF) co-regulation networks for both humans and mice. A consensus regulatory loop (miR17/miR20a–FOXE1–PDGFRA) and eight miRNAs (miR-140, miR-17, miR-18a, miR-19a, miR-19b, miR-20a, miR-451a and miR-92a) were discovered in both humans and mice. The role of miR-140, which had the strongest association with CP, was investigated in both human and mouse palate cells. The overexpression of miR-140-5p, but not miR-140-3p, significantly inhibited cell proliferation. We further examined whether miR-140 overexpression could suppress the expression of its predicted target genes (BMP2, FGF9, PAX9 and PDGFRA). Our results indicated that miR-140-5p overexpression suppressed the expression of BMP2 and FGF9 in cultured human palate cells and Fgf9 and Pdgfra in cultured mouse palate cells. In summary, our conserved miRNA–TF–gene regulatory network approach is effective in detecting consensus miRNAs, motifs, and regulatory mechanisms in human and mouse CP.


2020 ◽  
Vol 48 (W1) ◽  
pp. W275-W286 ◽  
Author(s):  
Anjun Ma ◽  
Cankun Wang ◽  
Yuzhou Chang ◽  
Faith H Brennan ◽  
Adam McDermaid ◽  
...  

Abstract A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSR), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.


2018 ◽  
Vol 47 (D1) ◽  
pp. D140-D144 ◽  
Author(s):  
Liang Cheng ◽  
Pingping Wang ◽  
Rui Tian ◽  
Song Wang ◽  
Qinghua Guo ◽  
...  

2021 ◽  
Author(s):  
Sreemol Gokuladhas ◽  
William Schierding ◽  
Roan Eltigani Zaied ◽  
Tayaza Fadason ◽  
Murim Choi ◽  
...  

Background & Aims: Non-alcoholic fatty liver disease (NAFLD) is a multi-system metabolic disease that co-occurs with various hepatic and extra-hepatic diseases. The phenotypic manifestation of NAFLD is primarily observed in the liver. Therefore, identifying liver-specific gene regulatory interactions between variants associated with NAFLD and multimorbid conditions may help to improve our understanding of underlying shared aetiology. Methods: Here, we constructed a liver-specific gene regulatory network (LGRN) consisting of genome-wide spatially constrained expression quantitative trait loci (eQTLs) and their target genes. The LGRN was used to identify regulatory interactions involving NAFLD-associated genetic modifiers and their inter-relationships to other complex traits. Results and Conclusions: We demonstrate that MBOAT7 and IL32, which are associated with NAFLD progression, are regulated by spatially constrained eQTLs that are enriched for an association with liver enzyme levels. MBOAT7 transcript levels are also linked to eQTLs associated with cirrhosis, and other traits that commonly co-occur with NAFLD. In addition, genes that encode interacting partners of NAFLD-candidate genes within the liver-specific protein-protein interaction network were affected by eQTLs enriched for phenotypes relevant to NAFLD (e.g. IgG glycosylation patterns, OSA). Furthermore, we identified distinct gene regulatory networks formed by the NAFLD-associated eQTLs in normal versus diseased liver, consistent with the context-specificity of the eQTL effects. Interestingly, genes targeted by NAFLD-associated eQTLs within the LGRN were also affected by eQTLs associated with NAFLD-related traits (e.g. obesity and body fat percentage). Overall, the genetic links identified between these traits expand our understanding of shared regulatory mechanisms underlying NAFLD multimorbidities.


2021 ◽  
Author(s):  
Jieun Jeong ◽  
Manolis Kellis

We assembled a panel of 28 tissue pairs of human and mouse with RNA-Seq data on gene expression. We focused on genes with no 1-to-1 homology, because they pose special challenges. In this way, we identified expression patterns that reveal and explain differences between the two species and suggest target genes for therapeutic applications. Here we mention three examples. One pattern is observed by defining the aggregate expression of immunoglobulin genes (which have no homology) as a measure of different levels of an immune response. In Lung, we used this statistic to find genes that have significantly higher expression in low/moderate response and thus may be therapy targets: increasing their expression or mimicking their function with medications may help in recovery from inflammation in the lungs. Some of the observed associations are common to human and mouse; other associations involve genes that function in cell-to-cell signaling or in regeneration but were not known to be important in Lung. A second pattern is that in the Small Intestine, mouse expresses far lower levels of antimicrobial defensins, while it has much higher expression of enzymes that are found to improve adaptive immune response. Such enzymes may be tested to determine whether they improve probiotic supplements that help in gut inflammation and other diseases. A third pattern involves a many-to-many homology group of defensins that did not have a described function. In human tissues, expression of its genes was found only in a study of a disease of hair-covered skin, but several of its genes are highly expressed in two tissues of our panel: mouse Skin and, to a lesser degree, mouse Vagina. This suggests that those genes or their homologs in other species may provide non-antibiotic medications for hair-covered skin and other tissues with a microbiome that includes fungi.


2019 ◽  
Author(s):  
Joanna Mitchelmore ◽  
Nastasiya Grinberg ◽  
Chris Wallace ◽  
Mikhail Spivakov

Abstract Identifying DNA cis-regulatory modules (CRMs) that control the expression of specific genes is crucial for deciphering the logic of transcriptional control. Natural genetic variation can point to the possible gene regulatory function of specific sequences through their allelic associations with gene expression. However, comprehensive identification of causal regulatory sequences in brute-force association testing without incorporating prior knowledge is challenging due to limited statistical power and effects of linkage disequilibrium. Sequence variants affecting transcription factor (TF) binding at CRMs have a strong potential to influence gene regulatory function, which provides a motivation for prioritising such variants in association testing. Here, we generate an atlas of CRMs showing predicted allelic variation in TF binding affinity in human lymphoblastoid cell lines (LCLs) and test their association with the expression of their putative target genes inferred from Promoter Capture Hi-C and immediate linear proximity. We reveal over 1300 CRM TF-binding variants associated with target gene expression, the majority of them undetected with standard association testing. A large proportion of CRMs showing associations with the expression of genes they contact in 3D localise to the promoter regions of other genes, supporting the notion of ‘epromoters’: dual-action CRMs with promoter and distal enhancer activity.


Development ◽  
1998 ◽  
Vol 125 (19) ◽  
pp. 3887-3894 ◽  
Author(s):  
E.S. Casey ◽  
M.A. O'Reilly ◽  
F.L. Conlon ◽  
J.C. Smith

Brachyury is a member of the T-box gene family and is required for formation of posterior mesoderm and notochord during vertebrate development. The ability of Brachyury to activate transcription is essential for its biological function, but nothing is known about its target genes. Here we demonstrate that Xenopus Brachyury directly regulates expression of eFGF by binding to an element positioned approximately 1 kb upstream of the eFGF transcription start site. This site comprises half of the palindromic sequence previously identified by binding site selection and is also present in the promoters of the human and mouse homologues of eFGF.


Author(s):  
Shuang Deng ◽  
Hongwan Zhang ◽  
Kaiyu Zhu ◽  
Xingyang Li ◽  
Ying Ye ◽  
...  

Abstract N6-methyladenosine (m6A) is the most abundant posttranscriptional modification in mammalian mRNA molecules and has a crucial function in the regulation of many fundamental biological processes. The m6A modification is a dynamic and reversible process regulated by a series of writers, erasers and readers (WERs). Different WERs might have different functions, and even the same WER might function differently in different conditions, which are mostly due to different downstream genes being targeted by the WERs. Therefore, identification of the targets of WERs is particularly important for elucidating this dynamic modification. However, there is still no public repository to host the known targets of WERs. Therefore, we developed the m6A WER target gene database (m6A2Target) to provide a comprehensive resource of the targets of m6A WERs. M6A2Target provides a user-friendly interface to present WER targets in two different modules: ‘Validated Targets’, referred to as WER targets identified from low-throughput studies, and ‘Potential Targets’, including WER targets analyzed from high-throughput studies. Compared to other existing m6A-associated databases, m6A2Target is the first specific resource for m6A WER target genes. M6A2Target is freely accessible at http://m6a2target.canceromics.org.


2016 ◽  
Vol 113 (13) ◽  
pp. E1835-E1843 ◽  
Author(s):  
Mina Fazlollahi ◽  
Ivor Muroff ◽  
Eunjee Lee ◽  
Helen C. Causton ◽  
Harmen J. Bussemaker

Regulation of gene expression by transcription factors (TFs) is highly dependent on genetic background and interactions with cofactors. Identifying specific context factors is a major challenge that requires new approaches. Here we show that exploiting natural variation is a potent strategy for probing functional interactions within gene regulatory networks. We developed an algorithm to identify genetic polymorphisms that modulate the regulatory connectivity between specific transcription factors and their target genes in vivo. As a proof of principle, we mapped connectivity quantitative trait loci (cQTLs) using parallel genotype and gene expression data for segregants from a cross between two strains of the yeast Saccharomyces cerevisiae. We identified a nonsynonymous mutation in the DIG2 gene as a cQTL for the transcription factor Ste12p and confirmed this prediction empirically. We also identified three polymorphisms in TAF13 as putative modulators of regulation by Gcn4p. Our method has potential for revealing how genetic differences among individuals influence gene regulatory networks in any organism for which gene expression and genotype data are available along with information on binding preferences for transcription factors.


2007 ◽  
Vol 27 (6) ◽  
pp. 2155-2165 ◽  
Author(s):  
Parviz Minoo ◽  
Lingyan Hu ◽  
Yiming Xing ◽  
Nian Ling Zhu ◽  
Hongyan Chen ◽  
...  

ABSTRACT NKX2.1 is a homeodomain transcription factor that controls development of the brain, lung, and thyroid. In the lung, Nkx2.1 is expressed in a proximo-distal gradient and activates specific genes in phenotypically distinct epithelial cells located along this axis. The mechanisms by which NKX2.1 controls its target genes may involve interactions with other transcription factors. We examined whether NKX2.1 interacts with members of the winged-helix/forkhead family of FOXA transcription factors to regulate two spatially and cell type-specific genes, SpC and Ccsp. The results show that NKX2.1 interacts physically and functionally with FOXA1. The nature of the interaction is inhibitory and occurs through the NKX2.1 homeodomain in a DNA-independent manner. On SpC, which lacks a FOXA1 binding site, FOXA1 attenuates NKX2.1-dependent transcription. Inhibition of FOXA1 by small interfering RNA increased SpC mRNA, demonstrating the in vivo relevance of this finding. In contrast, FOXA1 and NKX2.1 additively activate transcription from Ccsp, which includes both NKX2.1 and FOXA1 binding sites. In electrophoretic mobility shift assays, the GST-FOXA1 fusion protein interferes with the formation of NKX2.1 transcriptional complexes by potentially masking the latter's homeodomain DNA binding function. These findings suggest a novel mode of selective gene regulation by proximo-distal gradient distribution of and functional interactions between forkhead and homeodomain transcription factors.

