Learning a genome-wide score of human-mouse conservation at the functional genomics level

2020 ◽  
Author(s):  
Soo Bin Kwon ◽  
Jason Ernst

Abstract Identifying genomic regions with functional genomic properties that are conserved between human and mouse is an important challenge in the context of mouse model studies. To address this, we take a novel approach and learn a score of evidence of conservation at the functional genomics level by integrating large-scale information in a compendium of epigenomic, transcription factor binding, and transcriptomic data from human and mouse. The computational method we developed to do this, Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), trains a neural network, which is then used to generate a genome-wide score in human and mouse. The resulting LECIF score highlights human and mouse regions with shared functional genomic properties and captures correspondence of biologically similar human and mouse annotations even though it was not explicitly given such information. LECIF will be a resource for mouse model studies.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Soo Bin Kwon ◽  
Jason Ernst

Abstract Identifying genomic regions with functional genomic properties that are conserved between human and mouse is an important challenge in the context of mouse model studies. To address this, we develop a method to learn a score of evidence of conservation at the functional genomics level by integrating information from a compendium of epigenomic, transcription factor binding, and transcriptomic data from human and mouse. The method, Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), trains neural networks to generate this score for the human and mouse genomes. The resulting LECIF score highlights human and mouse regions with shared functional genomic properties and captures correspondence of biologically similar human and mouse annotations. Analysis with independent datasets shows the score also highlights loci associated with similar phenotypes in both species. LECIF will be a resource for mouse model studies by identifying loci whose functional genomic properties are likely conserved.


2021 ◽  
Author(s):  
Sabrina Lehmann ◽  
Bibi Atika ◽  
Daniela Grossmann ◽  
Christian Schmitt-Engel ◽  
Nadi Strohlein ◽  
...  

Abstract Background Functional genomics uses unbiased systematic genome-wide gene disruption or analyzes natural variations such as gene expression profiles of different tissues from multicellular organisms to link gene functions to particular phenotypes. Functional genomics approaches are of particular importance to identify large sets of genes that are specifically important for a particular biological process beyond known candidate genes, or when the process has not been studied with genetic methods before. Results Here, we present a large set of genes whose disruption interferes with the function of the odoriferous defensive stink glands of the red flour beetle Tribolium castaneum. This gene set is the result of a large-scale systematic phenotypic screen using a reverse genetics strategy based on RNA interference applied in a genome-wide forward genetics manner. In this first-pass screen, 130 genes were identified, of which 69 genes could be confirmed to cause knock-down gland phenotypes, which vary from necrotic tissue and irregular reservoir size to irregular color or separation of the secreted gland compounds. The knock-down of 13 genes caused specifically a strong reduction of para-benzoquinones, suggesting a specific function in the synthesis of these toxic compounds. Only 14 of the 69 confirmed gland genes are differentially overexpressed in stink gland tissue and thus could have been detected in a transcriptome-based analysis. Moreover, of the 29 previously transcriptomics-identified genes causing a gland phenotype, only one gene was recognized by this phenotypic screen despite the fact that 13 of them were covered by the screen. Conclusion Our results indicate the importance of combining diverse and independent methodologies to identify genes necessary for the function of a certain biological tissue, as the different approaches do not deliver redundant results but rather complement each other. 
The presented phenotypic screen, together with a transcriptomics approach, now provides a set of close to a hundred genes important for odoriferous defensive stink gland physiology in beetles.


2021 ◽  
Author(s):  
Heather R. Keys ◽  
Kristin A. Knouse

Abstract Our ability to understand and modulate mammalian physiology and disease requires knowing how all genes contribute to any given phenotype in the organism. Genome-wide screening using CRISPR-Cas9 has emerged as a powerful method for the genetic dissection of cellular processes [1,2], but the need to stably deliver single guide RNAs to millions of cells has restricted its implementation to ex vivo systems. These ex vivo systems cannot reproduce all of the cellular phenotypes observed in vivo nor can they recapitulate all of the factors that influence these phenotypes. There thus remains a pressing need for high-throughput functional genomics in a living organism. Here, we establish accessible genome-wide screening in the mouse liver and use this approach to uncover the complete regulation of cellular fitness in a living organism. We discover novel sex-specific and cell non-autonomous regulation of cell growth and viability. In particular, we find that the class I major histocompatibility complex is essential for preventing immune-mediated clearance of hepatocytes. Our approach provides the first comprehensive picture of cell fitness in a living organism and highlights the importance of investigating cellular phenomena in their native context. Our screening method is robust, scalable, and easily adapted to examine diverse cellular processes using any CRISPR application. We have hereby established a foundation for high-throughput functional genomics in a living mammal, enabling unprecedented insight into mammalian physiology and disease.


2020 ◽  
Author(s):  
Claire Marchal ◽  
Nivedita Singh ◽  
Ximena Corso-Díaz ◽  
Anand Swaroop

Abstract Three-dimensional (3D) conformation of the chromatin is crucial to stringently regulate gene expression patterns and DNA replication in a cell-type specific manner. HiC is a key technique for measuring 3D chromatin interactions genome-wide. Estimating and predicting the resolution of a library is an essential step in any HiC experimental design. Here, we present the mathematical concepts to estimate the resolution of a library and predict whether deeper sequencing would enhance the resolution. We have developed HiCRes, a Docker pipeline, by applying these concepts to human and mouse HiC libraries.
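To make "resolution of a library" concrete, here is a minimal sketch in Python. It assumes one convention that is widely used in the Hi-C literature: resolution is the smallest bin size at which at least 80% of bins have at least 1,000 contacts. The abstract does not say which criterion or thresholds HiCRes uses, so the function names, thresholds, and toy counts below are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, Optional, Sequence


def fraction_covered(bin_contacts: Sequence[int], min_contacts: int = 1000) -> float:
    """Return the fraction of bins with at least `min_contacts` contacts."""
    if not bin_contacts:
        return 0.0
    covered = sum(1 for count in bin_contacts if count >= min_contacts)
    return covered / len(bin_contacts)


def estimate_resolution(
    contacts_by_bin_size: Dict[int, Sequence[int]],
    min_contacts: int = 1000,   # assumed threshold, not taken from the abstract
    min_fraction: float = 0.8,  # assumed threshold, not taken from the abstract
) -> Optional[int]:
    """Return the smallest bin size (bp) meeting the coverage criterion, or None."""
    for bin_size in sorted(contacts_by_bin_size):
        counts = contacts_by_bin_size[bin_size]
        if fraction_covered(counts, min_contacts) >= min_fraction:
            return bin_size
    return None


# Hypothetical per-bin contact counts at three bin sizes (toy data only).
toy_library = {
    5_000: [1200, 300, 800, 1500, 200],
    10_000: [1500, 1100, 2300, 900, 1800],
    25_000: [4000, 3500, 5200, 2900, 4100],
}

print(estimate_resolution(toy_library))
```

With these toy counts, the 5 kb bins fail the criterion while the 10 kb bins meet it, so the sketch reports 10 kb. Predicting the effect of deeper sequencing would require a model of how contacts per bin grow with read depth, which this sketch does not attempt.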


2019 ◽  
Vol 7 (6) ◽  
pp. 161 ◽  
Author(s):  
Ming-Hsin Tsai ◽  
Yen-Yi Liu ◽  
Von-Wun Soo ◽  
Chih-Chieh Chen

Microbial diversity has always presented taxonomic challenges. With the popularity of next-generation sequencing technology, more unculturable bacteria have been sequenced, facilitating the discovery of additional new species and complicating current microbial classification. The major challenge is to assign appropriate taxonomic names. Hence, assessing the consistency between taxonomy and genomic relatedness is critical. We proposed and applied a genome comparison approach to a large-scale survey to investigate the distribution of genomic differences among microorganisms. The approach applies a genome-wide criterion, homologous coverage ratio (HCR), for describing the homology between species. The survey included 7861 microbial genomes that excluded plasmids, and 1220 pairs of genera exhibited ambiguous classification. In this study, we also compared the performance of HCR and average nucleotide identity (ANI). The results indicated that HCR and ANI analyses yield comparable results, but a few examples suggested that HCR has a superior clustering effect. In addition, we used the Genome Taxonomy Database (GTDB), the gold standard for taxonomy, to validate our analysis. The GTDB offers 120 ubiquitous single-copy proteins as marker genes for species classification. We determined that the analysis of the GTDB still results in blurred classification boundaries between some genera and that the marker gene-based approach has limitations. Although the choice of marker genes has been quite rigorous, the bias of marker gene selection remains unavoidable. Therefore, methods based on genomic alignment should be considered for species classification in order to avoid the bias of marker gene selection. On the basis of our observations of microbial diversity, microbial classification should be re-examined using genome-wide comparisons.


2019 ◽  
Vol 11 (8) ◽  
pp. 2078-2098 ◽  
Author(s):  
Shu-Ye Jiang ◽  
Jingjing Jin ◽  
Rajani Sarojam ◽  
Srinivasan Ramachandran

Abstract Terpenes are organic compounds and play important roles in plant growth and development as well as in mediating interactions of plants with the environment. Terpene synthases (TPSs) are the key enzymes responsible for the biosynthesis of terpenes. Although some species were employed for the genome-wide identification and characterization of the TPS family, limited information is available regarding the evolution, expansion, and retention mechanisms occurring in this gene family. We performed a genome-wide identification of the TPS family members in 50 sequenced genomes. Additionally, we also characterized the TPS family from aromatic spearmint and basil plants using RNA-Seq data. No TPSs were identified in algae genomes but the remaining plant species encoded various numbers of the family members ranging from 2 to 79 full-length TPSs. Some species showed lineage-specific expansion of certain subfamilies, which might have contributed toward species or ecotype divergence or environmental adaptation. A large-scale family expansion was observed mainly in dicot and monocot plants, which was accompanied by frequent domain loss. Both tandem and segmental duplication significantly contributed toward family expansion and expression divergence and played important roles in the survival of these expanded genes. Our data provide new insight into the TPS family expansion and evolution and suggest that TPSs might have originated from isoprenyl diphosphate synthase genes.


Science ◽  
2019 ◽  
Vol 365 (6456) ◽  
pp. eaat7693 ◽  
Author(s):  
Andrea Ganna ◽  
Karin J. H. Verweij ◽  
Michel G. Nivard ◽  
Robert Maier ◽  
Robbee Wedow ◽  
...  

Twin and family studies have shown that same-sex sexual behavior is partly genetically influenced, but previous searches for specific genes involved have been underpowered. We performed a genome-wide association study (GWAS) on 477,522 individuals, revealing five loci significantly associated with same-sex sexual behavior. In aggregate, all tested genetic variants accounted for 8 to 25% of variation in same-sex sexual behavior, only partially overlapped between males and females, and do not allow meaningful prediction of an individual’s sexual behavior. Comparing these GWAS results with those for the proportion of same-sex to total number of sexual partners among nonheterosexuals suggests that there is no single continuum from opposite-sex to same-sex sexual behavior. Overall, our findings provide insights into the genetics underlying same-sex sexual behavior and underscore the complexity of sexuality.


2018 ◽  
Vol 116 (3) ◽  
pp. 900-908 ◽  
Author(s):  
Hamutal Arbel ◽  
Sumanta Basu ◽  
William W. Fisher ◽  
Ann S. Hammonds ◽  
Kenneth H. Wan ◽  
...  

Identifying functional enhancer elements in metazoan systems is a major challenge. Large-scale validation of enhancers predicted by ENCODE reveals false-positive rates of at least 70%. We used the pregastrula-patterning network of Drosophila melanogaster to demonstrate that loss in accuracy in held-out data results from heterogeneity of functional signatures in enhancer elements. We show that at least two classes of enhancers are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, greater than 98% prediction accuracy can be achieved in a balanced, completely held-out test set. The class of well-predicted elements is composed predominantly of enhancers driving multistage segmentation patterns, which we designate segmentation driving enhancers (SDE). Prediction is driven by the DNA occupancy of early developmental transcription factors, with almost no additional power derived from histone modifications. We further show that improved accuracy is not a property of a particular prediction method: after conditioning on the SDE set, naïve Bayes and logistic regression perform as well as more sophisticated tools. Applying this method to a genome-wide scan, we predict 1,640 SDEs that cover 1.6% of the genome. An analysis of 32 SDEs using whole-mount embryonic imaging of stably integrated reporter constructs chosen throughout our prediction rank-list showed >90% drove expression patterns. We achieved 86.7% precision on a genome-wide scan, with an estimated recall of at least 98%, indicating high accuracy and completeness in annotating this class of functional elements.
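The precision and recall figures quoted above follow the standard definitions, which can be sketched in a few lines of Python. The counts below are hypothetical and chosen only to show the arithmetic. They are not the study's data, and the helper names are illustrative.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted elements that are real: TP / (TP + FP)."""
    predicted = true_pos + false_pos
    return true_pos / predicted if predicted else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of real elements that were predicted: TP / (TP + FN)."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


# Hypothetical confusion counts (not taken from the paper).
tp, fp, fn = 90, 10, 5

print(f"precision = {precision(tp, fp):.3f}")
print(f"recall    = {recall(tp, fn):.3f}")
```

Precision penalizes false positives, which is the failure mode the abstract attributes to earlier enhancer predictions, while recall measures how much of the true class the scan recovers.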


2021 ◽  
Author(s):  
Poppy Channa Sakti Sephton-Clark ◽  
Jennifer Tenor ◽  
Dena Toffaletti ◽  
Nancy Meyers ◽  
Charles Giamberardino ◽  
...  

Cryptococcus neoformans is the causative agent of cryptococcosis, a disease with poor patient outcomes, accounting for approximately 180,000 deaths each year. Patient outcomes may be impacted by the underlying genetics of the infecting isolate; however, our current understanding of how genetic diversity contributes to clinical outcomes is limited. Here, we leverage clinical, in vitro growth and genomic data for 284 C. neoformans isolates to identify clinically relevant pathogen variants within a population of clinical isolates from patients with HIV-associated cryptococcosis in Malawi. Through a genome-wide association study (GWAS) approach, we identify variants associated with fungal burden and growth rate. We also find both small and large-scale variation, including aneuploidy, associated with alternate growth phenotypes, which may impact the course of infection. Genes impacted by these variants are involved in transcriptional regulation, signal transduction, glycolysis, sugar transport, and glycosylation. When combined with clinical data, we show that growth within the CNS is reliant upon glycolysis in an animal model, and likely impacts patient mortality, as CNS burden modulates patient outcome. Additionally, we find genes with roles in sugar transport are under selection in the majority of these clinical isolates. Further, we demonstrate that two hypothetical proteins identified by GWAS impact virulence in animal models. Our approach illustrates links between genetic variation and clinically relevant phenotypes, shedding light on survival mechanisms within the CNS and pathways involved in this persistence.

