Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes

Abstract Insertions and deletions (indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to the germline indels in healthy genomes, somatic indels in cancer genomes contain more coding indels, with more frame-shift (FS) indels than expected. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to be located in sequences with important functions: they can affect the core secondary structures of proteins and have a greater overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
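The frame-shift (FS) distinction in the abstract above comes down to simple arithmetic: a coding indel shifts the reading frame when its net length change is not a multiple of three. The following is a minimal Python sketch of that rule, assuming indels are supplied as VCF-style REF/ALT allele strings. The function name `classify_coding_indel` is hypothetical and is not taken from the paper.

```python
def classify_coding_indel(ref: str, alt: str) -> str:
    """Classify a coding indel by its effect on the reading frame.

    `ref` and `alt` are VCF-style allele strings (an assumption of this
    sketch, not a detail stated in the abstract).
    """
    delta = len(alt) - len(ref)  # net change in sequence length
    if delta == 0:
        return "not_indel"  # substitution: no length change
    kind = "insertion" if delta > 0 else "deletion"
    # A length change that is a multiple of 3 preserves the reading frame.
    frame = "in_frame" if delta % 3 == 0 else "frame_shift"
    return f"{frame}_{kind}"
```

For example, `classify_coding_indel("AT", "A")` describes a one-base deletion and is reported as a frame-shift deletion, whereas a three-base deletion is reported as in-frame.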

CANCERSIGN: a user-friendly and robust tool for identification and classification of mutational signatures and patterns in cancer genomes

10.1101/424960 ◽

2018 ◽

Cited By ~ 1

Author(s):

Masroor Bayati ◽

Hamid Reza Rabiee ◽

Mehrdad Mehrbod ◽

Fatemeh Vafaee ◽

Diako Ebrahimi ◽

...

Keyword(s):

Somatic Mutation ◽

Whole Genome ◽

Mutational Signatures ◽

Coding Regions ◽

Whole Exome ◽

Cancer Genomes ◽

Genome Consortium ◽

User Friendly ◽

Pooled Samples

Analyses of large somatic mutation datasets, using advanced computational algorithms, have revealed at least 30 independent mutational signatures in tumor samples. These studies have been instrumental in identification and quantification of responsible endogenous and exogenous molecular processes in cancer. The quantitative approach used to deconvolute mutational signatures is becoming an integral part of cancer research. Therefore, development of a stand-alone tool with a user-friendly graphical interface for analysis of cancer mutational signatures is necessary. In this manuscript, we introduce CANCERSIGN as an open access bioinformatics tool that uses raw mutation data (BED files) as input, and identifies 3-mer and 5-mer mutational signatures. CANCERSIGN enables users to identify signatures within whole genome, whole exome or pooled samples. It can also identify signatures in specific regions of the genome (defined by user). Additionally, this tool enables users to perform clustering on tumor samples based on the raw mutation counts as well as using the proportion of mutational signatures in each sample. Using this tool, we analysed all the whole genome somatic mutation datasets profiled by the International Cancer Genome Consortium (ICGC) and identified a number of novel signatures. By examining signatures found in exonic and non-exonic regions of the genome using WGS and comparing this to signatures found in WES data we observe that WGS can identify additional non-exonic signatures that are enriched in the non-coding regions of the genome while the deeper sequencing of WES may help identify weak signatures that are otherwise missed in shallower WGS data.
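As an illustration of what a "3-mer" mutation category typically means, the sketch below labels each single-base substitution by its trinucleotide context, folding purine references onto the opposite strand so that the mutated base is always C or T. This is the conventional encoding used in mutational signature analysis and is assumed here; it is not confirmed to be CANCERSIGN's internal implementation, and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def context_label(trinuc: str, alt: str) -> str:
    """Return a pyrimidine-centred label such as 'A[C>T]G'.

    `trinuc` is the reference trinucleotide centred on the mutated base;
    `alt` is the substituted base on the same strand.
    """
    trinuc, alt = trinuc.upper(), alt.upper()
    if trinuc[1] in "AG":  # purine reference: report on the opposite strand
        trinuc, alt = revcomp(trinuc), COMPLEMENT[alt]
    return f"{trinuc[0]}[{trinuc[1]}>{alt}]{trinuc[2]}"


def count_contexts(mutations):
    """Count mutations per 3-mer category from (trinucleotide, alt) pairs."""
    return Counter(context_label(trinuc, alt) for trinuc, alt in mutations)
```

A per-sample vector of these counts is the kind of input that signature deconvolution and sample clustering operate on. A 5-mer variant would extend the context by one base on each side.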

The "Core" of the Dark Triad: A test of competing hypotheses

10.31234/osf.io/gmh7p ◽

2019 ◽

Cited By ~ 1

Author(s):

Colin Vize ◽

Donald Lynam ◽

Katherine Collison ◽

Josh Miller

Keyword(s):

Dark Triad ◽

The Core ◽

Large Samples ◽

Parsimonious Explanation ◽

Shared Variance ◽

Competing Hypotheses ◽

Callous Affect

As research on the Dark Triad (DT; the interrelated constructs of Machiavellianism, narcissism, and psychopathy) has accumulated, a subset of this research has focused on explicating what traits may account for the overlap among the DT members. Various candidate traits have been investigated, with evidence supporting several of them including Antagonism (vs. Agreeableness), Honesty-Humility, and Callousness and Interpersonal Manipulation (the latter two as a set). The present study sought to test the leading candidates against one another in their ability to account for the shared variance among the DT members. Using a pre-registered analytical plan, we found that Agreeableness (as measured by the IPIP-NEO-120), Honesty-Humility from the HEXACO, and the SRP-III subscales of Callous Affect and Interpersonal Manipulation accounted for all or nearly all of the shared variance among the DT members. BFI-based measures of Agreeableness (BFI and BFI-2) accounted for notably less variance in most cases. The results were consistent across two large samples (Ns of 627 and 628), and across various DT measurement approaches. We argue that the most parsimonious explanation for findings on the core of the DT is that such traits all fall under the umbrella of Antagonism.

Predicting pathogenic non-coding SVs disrupting the 3D genome in 1646 whole cancer genomes using multiple instance learning

Scientific Reports ◽

10.1038/s41598-021-93917-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Marleen M. Nieboer ◽

Luan Nguyen ◽

Jeroen de Ridder

Keyword(s):

Multiple Instance Learning ◽

Cancer Diagnostics ◽

Common Mechanism ◽

Open Chromatin ◽

Driver Genes ◽

3D Genome ◽

Whole Genomes ◽

Cancer Genomes ◽

Cancer Types ◽

The Impact

Abstract Over the past years, large consortia have been established to fuel the sequencing of whole genomes of many cancer patients. Despite the increased abundance in tools to study the impact of SNVs, non-coding SVs have been largely ignored in these data. Here, we introduce svMIL2, an improved version of our Multiple Instance Learning-based method to study the effect of somatic non-coding SVs disrupting boundaries of TADs and CTCF loops in 1646 cancer genomes. We demonstrate that svMIL2 predicts pathogenic non-coding SVs with an average AUC of 0.86 across 12 cancer types, and identifies non-coding SVs affecting well-known driver genes. The disruption of active (super) enhancers in open chromatin regions appears to be a common mechanism by which non-coding SVs exert their pathogenicity. Finally, our results reveal that the contribution of pathogenic non-coding SVs as opposed to driver SNVs may highly vary between cancers, with notably high numbers of genes being disrupted by pathogenic non-coding SVs in ovarian and pancreatic cancer. Taken together, our machine learning method offers a potent way to prioritize putatively pathogenic non-coding SVs and leverage non-coding SVs to identify driver genes. Moreover, our analysis of 1646 cancer genomes demonstrates the importance of including non-coding SVs in cancer diagnostics.

Structures of Rhodopseudomonas palustris RC-LH1 complexes with open or closed quinone channels

Science Advances ◽

10.1126/sciadv.abe2631 ◽

2021 ◽

Vol 7 (3) ◽

pp. eabe2631

Author(s):

David J. K. Swainsbury ◽

Pu Qian ◽

Philip J. Jackson ◽

Kaitlyn M. Faries ◽

Dariusz M. Niedzwiedzki ◽

...

Keyword(s):

Binding Sites ◽

Rhodopseudomonas Palustris ◽

Ring Closure ◽

Light Harvesting Complex ◽

The Core ◽

Quinone Binding ◽

Cryo Electron Microscopy ◽

Unique Protein ◽

Center Light ◽

Resolution Structure

The reaction-center light-harvesting complex 1 (RC-LH1) is the core photosynthetic component in purple phototrophic bacteria. We present two cryo–electron microscopy structures of RC-LH1 complexes from Rhodopseudomonas palustris. A 2.65-Å resolution structure of the RC-LH114-W complex consists of an open 14-subunit LH1 ring surrounding the RC interrupted by protein-W, whereas the complex without protein-W at 2.80-Å resolution comprises an RC completely encircled by a closed, 16-subunit LH1 ring. Comparison of these structures provides insights into quinone dynamics within RC-LH1 complexes, including a previously unidentified conformational change upon quinone binding at the RC QB site, and the locations of accessory quinone binding sites that aid their delivery to the RC. The structurally unique protein-W prevents LH1 ring closure, creating a channel for accelerated quinone/quinol exchange.

Secondary Structural Model of MALAT1 Becomes Unstructured in Chronic Myeloid Leukemia and Undergoes Structural Rearrangement in Cervical Cancer

Non-Coding RNA ◽

10.3390/ncrna7010006 ◽

2021 ◽

Vol 7 (1) ◽

pp. 6

Author(s):

Matthew C. Wang ◽

Phillip J. McCown ◽

Grace E. Schiefelbein ◽

Jessica A. Brown

Keyword(s):

Cervical Cancer ◽

Chronic Myeloid Leukemia ◽

Hela Cells ◽

Binding Sites ◽

Myeloid Leukemia ◽

Structural Changes ◽

Structural Model ◽

Dimethyl Sulfate ◽

K562 Cells ◽

Cancer Types

Long noncoding RNAs (lncRNAs) influence cellular function through binding events that often depend on the lncRNA secondary structure. One such lncRNA, metastasis-associated lung adenocarcinoma transcript 1 (MALAT1), is upregulated in many cancer types and has a myriad of protein- and miRNA-binding sites. Recently, a secondary structural model of MALAT1 in noncancerous cells was proposed to form 194 hairpins and 13 pseudoknots. That study postulated that, in cancer cells, the MALAT1 structure likely varies, thereby influencing cancer progression. This work analyzes how that structural model is expected to change in K562 cells, which originated from a patient with chronic myeloid leukemia (CML), and in HeLa cells, which originated from a patient with cervical cancer. Dimethyl sulfate-sequencing (DMS-Seq) data from K562 cells and psoralen analysis of RNA interactions and structure (PARIS) data from HeLa cells were compared to the working structural model of MALAT1 in noncancerous cells to identify sites that likely undergo structural alterations. MALAT1 in K562 cells is predicted to become more unstructured, with almost 60% of examined hairpins in noncancerous cells losing at least half of their base pairings. Conversely, MALAT1 in HeLa cells is predicted to largely maintain its structure, undergoing 18 novel structural rearrangements. Moreover, 50 validated miRNA-binding sites are affected by putative secondary structural changes in both cancer types, such as miR-217 in K562 cells and miR-20a in HeLa cells. Structural changes unique to K562 cells and HeLa cells provide new mechanistic leads into how the structure of MALAT1 may mediate cancer in a cell-type specific manner.

Identification and Characterization of the Genes Encoding the Core Histones and Histone Variants of Neurospora crassa

Genetics ◽

10.1093/genetics/160.3.961 ◽

2002 ◽

Vol 160 (3) ◽

pp. 961-973 ◽

Cited By ~ 1

Author(s):

Shan M Hays ◽

Johanna Swanson ◽

Eric U Selker

Keyword(s):

Neurospora Crassa ◽

Histone Variants ◽

Null Alleles ◽

Histone Genes ◽

Histone H4 ◽

Coding Regions ◽

The Core ◽

Genomic Arrangement ◽

Genes Encoding ◽

Core Histones

Abstract We have identified and characterized the complete complement of genes encoding the core histones of Neurospora crassa. In addition to the previously identified pair of genes that encode histones H3 and H4 (hH3 and hH4-1), we identified a second histone H4 gene (hH4-2), a divergently transcribed pair of genes that encode H2A and H2B (hH2A and hH2B), a homolog of the F/Z family of H2A variants (hH2Az), a homolog of the H3 variant CSE4 from Saccharomyces cerevisiae (hH3v), and a highly diverged H4 variant (hH4v) not described in other species. The hH4-1 and hH4-2 genes, which are 96% identical in their coding regions and encode identical proteins, were inactivated independently. Strains with inactivating mutations in either gene were phenotypically wild type, in terms of growth rates and fertility, but the double mutants were inviable. As expected, we were unable to isolate null alleles of hH2A, hH2B, or hH3. The genomic arrangement of the histone and histone variant genes was determined. hH2Az and the hH3-hH4-1 gene pair are on LG IIR, with hH2Az centromere-proximal to hH3-hH4-1 and hH3 centromere-proximal to hH4-1. hH3v and hH4-2 are on LG IIIR with hH3v centromere-proximal to hH4-2. hH4v is on LG IVR and the hH2A-hH2B pair is located immediately right of the LG VII centromere, with hH2A centromere-proximal to hH2B. Except for the centromere-distal gene in the pairs, all of the histone genes are transcribed toward the centromere. Phylogenetic analysis of the N. crassa histone genes places them in the Euascomycota lineage. In contrast to the general case in eukaryotes, histone genes in euascomycetes are few in number and contain introns. This may be a reflection of the evolution of the RIP (repeat-induced point mutation) and MIP (methylation induced premeiotically) processes that detect sizable duplications and silence associated genes.

Untranslated regions of mRNA and their role in regulation of gene expression in protozoan parasites

Journal of Biosciences ◽

10.1007/s12038-016-9660-7 ◽

2017 ◽

Vol 42 (1) ◽

pp. 189-207 ◽

Cited By ~ 2

Author(s):

Shilpa J Rao ◽

Sangeeta Chatterjee ◽

Jayanta K Pal

Keyword(s):

Gene Expression ◽

Regulation Of Gene Expression ◽

Untranslated Regions ◽

Protozoan Parasites

The search of CAR, AhR, ESRs binding sites in promoters of intronic and intergenic microRNAs

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720017500299 ◽

2018 ◽

Vol 16 (01) ◽

pp. 1750029 ◽

Cited By ~ 6

Author(s):

Vladimir Y. Ovchinnikov ◽

Denis V. Antonets ◽

Lyudmila F. Gulyaeva

Keyword(s):

Transcriptional Activation ◽

Binding Sites ◽

In Silico ◽

Target Genes ◽

Nuclear Translocation ◽

Regulation Of Gene Expression ◽

Experimental Studies ◽

Transcriptional Level ◽

Aryl Hydrocarbon ◽

Exogenous Compounds

MicroRNAs (miRNAs) play important roles in the regulation of gene expression at the post-transcriptional level. Many exogenous compounds or xenobiotics may affect microRNA expression. It is well established that xenobiotics with a planar structure, such as TCDD and benzo(a)pyrene (BP), can bind the aryl hydrocarbon receptor (AhR), followed by its nuclear translocation and transcriptional activation of target genes. Another chemically diverse group of xenobiotics, including phenobarbital and DDT, can activate the nuclear receptor CAR and, in some cases, the estrogen receptors ESR1 and ESR2. We hypothesized that such chemicals can affect miRNA expression through the activation of AhR, CAR, and ESRs. To test this hypothesis, we used in silico methods to find potential DRE, PBEM, and ERE binding sites for these receptors, respectively. We have predicted AhR, CAR, and ESRs binding sites in 224 rat, 201 mouse, and 232 human promoters of miRNA-coding genes. In addition, we have identified a number of miRNAs with predicted AhR, CAR, and ESRs binding sites that are known as oncogenes and as tumor suppressors. Our results, obtained in silico, open a new strategy for ongoing experimental studies and will contribute to further investigation of epigenetic mechanisms of carcinogenesis.

Structure and expression of canary myc family genes

Molecular and Cellular Biology ◽

10.1128/mcb.11.3.1770-1776.1991 ◽

1991 ◽

Vol 11 (3) ◽

pp. 1770-1776

Author(s):

R G Collum ◽

D F Clayton ◽

F W Alt

Keyword(s):

Untranslated Region ◽

Untranslated Regions ◽

Coding Region ◽

Protein Coding ◽

Coding Regions ◽

Neuronal Precursors ◽

Myc Gene ◽

Mature Neurons

We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons.

Sex Differences in Cancer Genomes: Much Learned, More Unknown

Endocrinology ◽

10.1210/endocr/bqab170 ◽

2021 ◽

Author(s):

Chenghao Zhu ◽

Paul C Boutros

Keyword(s):

Sex Differences ◽

Precision Medicine ◽

Cause Of Death ◽

Clinical Presentation ◽

Response To Treatment ◽

Critical Gap ◽

Cancer Genomes ◽

Cancer Initiation ◽

Cancer Types ◽

Key Resources

Abstract Cancer is a leading cause of death worldwide. Sex influences cancer in a bewildering variety of ways. In some cancer types it affects prevalence, in others genomic profiles, or response to treatment, or mortality. In some sex seems to have little or no influence. How and when sex influences cancer initiation and progression remain a critical gap in our understanding of cancer, with direct relevance to precision medicine. Here, we note several factors that complicate our understanding of sex differences: representativeness of large cohorts, confounding with features like ancestry, age and obesity, and variability in clinical presentation. We summarize the key resources available to study molecular sex differences, and suggest some likely directions for improving our understanding of how patient sex influences cancer behaviour.
