Detecting cancer vulnerabilities through gene networks under purifying selection in 4,700 cancer genomes

2017 ◽  
Author(s):  
Anika Gupta ◽  
Heiko Horn ◽  
Parisa Razaz ◽  
April Kim ◽  
Michael Lawrence ◽  
...  

ABSTRACT: Large-scale cancer sequencing studies have uncovered dozens of mutations critical to cancer initiation and progression. However, a significant proportion of genes linked to tumor propagation remain hidden, often due to noise in sequencing data confounding low-frequency alterations. Further, genes in networks under purifying selection (NPS), or those that are mutated in cancers less frequently than would be expected by chance, may play crucial roles in sustaining cancers but have largely been overlooked. We describe here a statistical framework that identifies genes that have a first-order protein interaction network significantly depleted for mutations, to elucidate key genetic contributors to cancers. Not reliant on, and thus unbiased by, the gene of interest’s mutation rate, our approach has identified 685 putative genes linked to cancer development. Comparative analysis indicates statistically significant enrichment of NPS genes in previously validated cancer vulnerability gene sets, while further identifying novel cancer-specific candidate gene targets. As more tumor genomes are sequenced, integrating systems-level mutation data through this network approach should become increasingly useful in pinpointing gene targets for cancer diagnosis and treatment.
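The abstract does not spell out the test statistic, so the following is only a minimal sketch of what a "network depleted for mutations" test could look like, not the authors' method. It assumes a single uniform background mutation rate (a strong simplification) and a one-sided binomial test; the function names, the `network` and `mutated_samples` inputs, and `background_rate` are all hypothetical.

```python
from math import comb


def binomial_lower_tail(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def neighbourhood_depletion_p(gene, network, mutated_samples, n_samples, background_rate):
    """One-sided p-value that the neighbours of `gene` carry fewer mutations than expected.

    network          -- dict: gene -> set of first-order interaction partners (hypothetical input)
    mutated_samples  -- dict: gene -> number of tumours in which that gene is mutated
    n_samples        -- total number of tumours
    background_rate  -- assumed per-gene, per-tumour mutation probability
    """
    # The index gene is excluded, so its own mutation rate never enters the test.
    neighbours = network[gene] - {gene}
    observed = sum(mutated_samples.get(g, 0) for g in neighbours)
    trials = len(neighbours) * n_samples
    return binomial_lower_tail(observed, trials, background_rate)
```

Because only the neighbours' counts are summed, changing the index gene's own mutation count leaves the p-value unchanged, which mirrors the property the abstract emphasises. A realistic analysis would replace the uniform rate with gene-specific expectations and correct for multiple testing.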

2019 ◽  
Author(s):  
Mikhail V Pogorelyy ◽  
Mikhail Shugay

Abstract: Recently developed molecular methods allow large-scale profiling of T-cell receptor (TCR) sequences that encode the antigen specificity and immunological memory of these cells. However, it is well known that even the unperturbed TCR repertoire structure is extremely complex due to the high diversity of TCR rearrangements and multiple biases imprinted by the VDJ rearrangement process. The latter gives rise to the phenomenon of “public” TCR clonotypes that can be shared across multiple individuals and to the non-trivial structure of the TCR similarity network. Here we outline a framework for TCR sequencing data analysis that can control for these biases in order to infer TCRs that are involved in response to antigens of interest. Using an example dataset of donors with known HLA haplotype and CMV status, we demonstrate that by applying HLA restriction rules and matching against a database of TCRs with known antigen specificity it is possible to robustly detect motifs of epitope-specific responses in individual repertoires. We also highlight potential shortcomings of TCR clustering methods and demonstrate that highly expanded TCRs should be individually assessed to get the full picture of the antigen-specific response.


2017 ◽  
Author(s):  
Mark J.P. Chaisson ◽  
Ashley D. Sanders ◽  
Xuefang Zhao ◽  
Ankit Malhotra ◽  
David Porubsky ◽  
...  

ABSTRACT: The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent–child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per human genome. We also discover 156 inversions per genome, most of which previously escaped detection. Fifty-eight of the inversions we discovered intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The method and the dataset serve as a gold standard for the scientific community and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.


2017 ◽  
Author(s):  
René Luijk ◽  
Koen F. Dekkers ◽  
Maarten van Iterson ◽  
Wibowo Arindrarto ◽  
Annique Claringbould ◽  
...  

ABSTRACT: Identification of causal drivers behind regulatory gene networks is crucial in understanding gene function. We developed a method for the large-scale inference of gene-gene interactions in observational population genomics data that are both directed (using local genetic instruments as causal anchors, akin to Mendelian Randomization) and specific (by controlling for linkage disequilibrium and pleiotropy). The analysis of genotype and whole-blood RNA-sequencing data from 3,072 individuals identified 49 genes as drivers of downstream transcriptional changes (P < 7 × 10⁻¹⁰), among which transcription factors were overrepresented (P = 3.3 × 10⁻⁷). Our analysis suggests new gene functions and targets, including for SENP7 (zinc-finger genes involved in retroviral repression) and BCL2A1 (novel target genes possibly involved in auditory dysfunction). Our work highlights the utility of population genomics data in deriving directed gene expression networks. A resource of trans-effects for all 6,600 genes with a genetic instrument can be explored individually using a web-based browser.


2019 ◽  
Author(s):  
Farhan Ali ◽  
Aswin Sai Narain Seshasayee

Abstract: The evolution of bacterial regulatory networks has largely been explained at macroevolutionary scales through lateral gene transfer and gene duplication. Transcription factors (TFs) have been found to be less conserved across species than their target genes (TGs). This would be expected if TFs accumulate mutations faster than TGs. This hypothesis is supported by several lab evolution studies which found TFs, especially global regulators, to be frequently mutated. Despite these studies, the contribution of point mutations in TFs to the evolution of regulatory networks is poorly understood. We tested if TFs show greater genetic variation than their TGs using whole-genome sequencing data from a large collection of E. coli isolates. We found TFs to be less diverse, across natural isolates, due to their regulatory roles. TFs were enriched in mutations in multiple adaptive lab evolution studies but not in mutation accumulation. However, over long-term evolution, the relative frequency of mutations in TFs showed a gradual decay after a rapid initial burst. Our results suggest that point mutations, conferring large-scale expression changes, may drive the early stages of adaptation but gene regulation is subjected to stronger purifying selection post adaptation.


2021 ◽  
Vol 118 (47) ◽  
pp. e2105395118
Author(s):  
Xiao Liu ◽  
David A. Leopold ◽  
Yifan Yang

The resting brain consumes enormous energy and shows highly organized spontaneous activity. To investigate how this activity is manifest among single neurons, we analyzed spiking discharges of ∼10,000 isolated cells recorded from multiple cortical and subcortical regions of the mouse brain during immobile rest. We found that firing of a significant proportion (∼70%) of neurons conformed to a ubiquitous, temporally sequenced cascade of spiking that was synchronized with global events and elapsed over timescales of 5 to 10 s. Across the brain, two intermixed populations of neurons supported orthogonal cascades. The relative phases of these cascades determined, at each moment, the response magnitude evoked by an external visual stimulus. Furthermore, the spiking of individual neurons embedded in these cascades was time locked to physiological indicators of arousal, including local field potential power, pupil diameter, and hippocampal ripples. These findings demonstrate that the large-scale coordination of low-frequency spontaneous activity, which is commonly observed in brain imaging and linked to arousal, sensory processing, and memory, is underpinned by sequential, large-scale temporal cascades of neuronal spiking across the brain.


2017 ◽  
Vol 3 (6) ◽  
pp. e200 ◽  
Author(s):  
Ralph D. Hector ◽  
Vera M. Kalscheuer ◽  
Friederike Hennig ◽  
Helen Leonard ◽  
Jenny Downs ◽  
...  

Objective: To provide new insights into the interpretation of genetic variants in a rare neurologic disorder, CDKL5 deficiency, in the contexts of population sequencing data and an updated characterization of the CDKL5 gene.
Methods: We analyzed all known potentially pathogenic CDKL5 variants by combining data from large-scale population sequencing studies with CDKL5 variants from new and all available clinical cohorts and combined this with computational methods to predict pathogenicity.
Results: The study has identified several variants that can be reclassified as benign or likely benign. With the addition of novel CDKL5 variants, we confirm that pathogenic missense variants cluster in the catalytic domain of CDKL5 and reclassify a purported missense variant as having a splicing consequence. We provide further evidence that missense variants in the final 3 exons are likely to be benign and not important to disease pathology. We also describe benign splicing and nonsense variants within these exons, suggesting that isoform hCDKL5_5 is likely to have little or no neurologic significance. We also use the available data to make a preliminary estimate of minimum incidence of CDKL5 deficiency.
Conclusions: These findings have implications for genetic diagnosis, providing evidence for the reclassification of specific variants previously thought to result in CDKL5 deficiency. Together, these analyses support the view that the predominant brain isoform in humans (hCDKL5_1) is crucial for normal neurodevelopment and that the catalytic domain is the primary functional domain.


Biology ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 11 ◽  
Author(s):  
Rochelle Wickramasekara ◽  
Holly Stessman

Neurogenesis is an elegantly coordinated developmental process that must maintain a careful balance of proliferation and differentiation programs to be compatible with life. Due to the fine-tuning required for these processes, epigenetic mechanisms (e.g., DNA methylation and histone modifications) are employed, in addition to changes in mRNA transcription, to regulate gene expression. The purpose of this review is to highlight what we currently know about histone 4 lysine 20 (H4K20) methylation and its role in the developing brain. Utilizing publicly-available RNA-Sequencing data and published literature, we highlight the versatility of H4K20 methyl modifications in mediating diverse cellular events from gene silencing/chromatin compaction to DNA double-stranded break repair. From large-scale human DNA sequencing studies, we further propose that the lysine methyltransferase gene, KMT5B (OMIM: 610881), may fit into a category of epigenetic modifier genes that are critical for typical neurodevelopment, such as EHMT1 and ARID1B, which are associated with Kleefstra syndrome (OMIM: 610253) and Coffin-Siris syndrome (OMIM: 135900), respectively. Based on our current knowledge of the H4K20 methyl modification, we discuss emerging themes and interesting questions on how this histone modification, and particularly KMT5B expression, might impact neurodevelopment along with current challenges and potential avenues for future research.


2021 ◽  
Author(s):  
Xiao Liu ◽  
David A. Leopold ◽  
Yifan Yang

Abstract: The resting brain consumes enormous energy and shows highly organized spontaneous activity. To investigate how this activity is manifest among single neurons, we analyzed spiking discharges of ∼10,000 isolated cells recorded from multiple cortical and subcortical regions of the mouse brain during immobile rest. We found that firing of a significant proportion (∼70%) of neurons conformed to a ubiquitous, temporally sequenced cascade of spiking that was synchronized with global events and elapsed over timescales of 5-10 seconds. Across the brain, two intermixed populations of neurons supported orthogonal cascades. The relative phases of these cascades determined, at each moment, the response magnitude evoked by an external visual stimulus. Furthermore, the spiking of individual neurons embedded in these cascades was time locked to physiological indicators of arousal, including local field potential (LFP) power, pupil diameter, and hippocampal ripples. These findings demonstrate that the large-scale coordination of low-frequency spontaneous activity, which is commonly observed in brain imaging and linked to arousal, sensory processing, and memory, is underpinned by sequential, large-scale temporal cascades of neuronal spiking across the brain.


2018 ◽  
Author(s):  
Malika Kumar Freund ◽  
Kathryn Burch ◽  
Huwenbo Shi ◽  
Nicholas Mancuso ◽  
Gleb Kichaev ◽  
...  

ABSTRACT: Although recent studies provide evidence for a common genetic basis between complex traits and Mendelian disorders, a thorough quantification of their overlap in a phenotype-specific manner remains elusive. Here, we quantify the overlap of genes identified through large-scale genome-wide association studies (GWAS) for 62 complex traits and diseases with genes known to cause 20 broad categories of Mendelian disorders. We identify a significant enrichment of phenotypically-matched Mendelian disorder genes in GWAS gene sets. Further, we observe elevated GWAS effect sizes near phenotypically-matched Mendelian disorder genes. Finally, we report examples of GWAS variants localized at the transcription start site or physically interacting with the promoters of phenotypically-matched Mendelian disorder genes. Our results are consistent with the hypothesis that genes that are disrupted in Mendelian disorders are dysregulated by noncoding variants in complex traits, and demonstrate how leveraging findings from related Mendelian disorders and functional genomic datasets can prioritize genes that are putatively dysregulated by local and distal non-coding GWAS variants.
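One common way to quantify whether two gene sets overlap more than chance would predict is a hypergeometric (one-sided Fisher) test. The sketch below illustrates that generic technique only; the abstract does not state which enrichment test the authors used, and the function names and the choice of gene universe are assumptions made for illustration.

```python
from math import comb


def hypergeom_upper_tail(overlap: int, universe: int, set_a: int, set_b: int) -> float:
    """P(X >= overlap) when set_b genes are drawn without replacement
    from a universe that contains set_a 'marked' genes."""
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    favourable = sum(
        comb(set_a, i) * comb(universe - set_a, set_b - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def gene_set_enrichment_p(gwas_genes: set, mendelian_genes: set, all_genes: set) -> float:
    """Enrichment p-value for Mendelian disorder genes among GWAS genes.

    Both gene sets are restricted to `all_genes` (a hypothetical background
    universe, e.g. all protein-coding genes tested)."""
    gwas = gwas_genes & all_genes
    mendelian = mendelian_genes & all_genes
    return hypergeom_upper_tail(
        len(gwas & mendelian), len(all_genes), len(mendelian), len(gwas)
    )
```

The result depends heavily on the background universe: a universe that includes genes that could never have been reported by the GWAS inflates the apparent enrichment.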


2018 ◽  
Author(s):  
Elliott Rees ◽  
Noa Carrera ◽  
Joanne Morgan ◽  
Kirsty Hambridge ◽  
Valentina Escott-Price ◽  
...  

Abstract: Sequencing studies have highlighted candidate sets of genes involved in schizophrenia, including activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-d-aspartate receptor (NMDAR) complexes. Two genes, SETD1A and RBM12, have also been associated with robust statistical evidence. Larger samples and novel methods for identifying disease-associated missense variants are needed to reveal novel genes and biological mechanisms associated with schizophrenia. We sequenced 187 genes, selected for prior evidence of association with schizophrenia, in a new dataset of 5,207 cases and 4,991 controls. Included were members of ARC and NMDAR post-synaptic protein complexes, as well as voltage-gated sodium and calcium channels. We observed a significant case excess of rare (<0.1% in frequency) loss-of-function (LoF) mutations across all 187 genes (OR = 1.36; corrected P = 0.0072) but no individual gene was associated with schizophrenia after correcting for multiple testing. We found novel evidence that LoF and missense variants at paralog conserved sites were enriched in sodium channels (OR = 1.26; P = 0.0035). Meta-analysis of our new data with published sequencing data (11,319 cases, 15,854 controls and 1,136 trios) supported and refined this association to sodium channel alpha subunits (P = 0.0029). Meta-analysis also confirmed association between schizophrenia and rare variants in ARC (P = 4.0 × 10⁻⁴) and NMDAR (P = 1.7 × 10⁻⁵) synaptic genes. No association was found between rare variants in calcium channels and schizophrenia. In one of the largest sequencing studies of schizophrenia to date, we provide novel evidence that multiple voltage-gated sodium channels are involved in schizophrenia pathogenesis, and increase the evidence for association between rare variants in ARC and NMDAR post-synaptic complexes and schizophrenia. Larger samples are required to identify specific genes and variants driving these associations.
Author Summary: Common and rare genetic variations are known to play a substantial role in the development of schizophrenia. Recently, sequencing studies have started to highlight specific sets of genes that are enriched for rare variation in schizophrenia, such as the synaptic gene sets ARC and NMDAR, as well as voltage-gated sodium and calcium channels. To confirm the role of these gene sets in schizophrenia, and identify specific risk genes, we sequenced 187 genes in a new sample of 5,207 schizophrenia cases and 4,991 controls. We find an excess of protein truncating mutations with a frequency <0.1% in all 187 targeted genes, and provide novel evidence that mutations altering amino acids conserved across sodium channel proteins are risk factors for schizophrenia. Through meta-analysing our new data with previously published sequencing data sets, for a total of 11,319 cases, 15,854 controls and 1,136 trios, we increase the evidence for association between rare coding variants and schizophrenia in voltage-gated sodium channels, as well as in synaptic gene sets ARC and NMDAR. Although no individual gene was associated with schizophrenia, these findings suggest larger studies will identify the specific genes driving these associations.
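For readers unfamiliar with the odds ratios (OR) quoted above, the sketch below shows how an OR and an approximate confidence interval can be computed from a 2×2 carrier table. It is an illustration of the general statistic, not the study's burden-testing pipeline, which the abstract does not describe; the function name and the Woolf (log-scale) interval are choices made here for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval from a 2x2 table.

    z = 1.96 gives an approximate 95% interval."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        # The plain formula is undefined with empty cells; a continuity
        # correction would be needed, which this sketch deliberately omits.
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)
```

An OR above 1 with an interval excluding 1 indicates that carriers are over-represented among cases; rare-variant analyses typically also adjust for covariates, which a raw 2×2 table cannot do.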

