Genome Research
Latest Publications


TOTAL DOCUMENTS

5484
(FIVE YEARS 549)

H-INDEX

284
(FIVE YEARS 24)

Published By Cold Spring Harbor Laboratory

1088-9051

2022 ◽  
pp. gr.276103.121
Author(s):  
Daniel Melamed ◽  
Yuval Nov ◽  
Assaf Malik ◽  
Michael B Yakass ◽  
Evgeni Bolotin ◽  
...  

While it is known that the mutation rate varies across the genome, previous estimates were based on averaging across various numbers of positions. Here we describe a method to measure the origination rates of target mutations at target base positions and apply it to a 6-bp region in the human hemoglobin subunit beta (HBB) gene and to the identical, paralogous hemoglobin subunit delta (HBD) region in sperm cells from both African and European donors. The HBB region of interest (ROI) includes the site of the hemoglobin S (HbS) mutation, which protects against malaria, is common in Africa and has served as a classic example of adaptation by random mutation and natural selection. We found a significant correspondence between de novo mutation rates and past observations of alleles in carriers, showing that mutation rates vary substantially in a mutation-specific manner that contributes to the site frequency spectrum. We also found that the overall point mutation rate is significantly higher in Africans than in Europeans in the HBB region studied. Finally, the rate of the 20A→T mutation, called the 'HbS mutation' when it appears in HBB, is significantly higher than expected from the genome-wide average for this mutation type. Nine instances were observed in the African HBB ROI, where it is of adaptive significance, representing at least three independent originations; no instances were observed elsewhere. Further studies will be needed to examine mutation rates at the single-mutation resolution across these and other loci and organisms and to uncover the molecular mechanisms responsible.


2022 ◽  
pp. gr.275533.121
Author(s):  
Tyler A Joseph ◽  
Philippe Chlenski ◽  
Aviya Litman ◽  
Tal Korem ◽  
Itsik Pe'er

Patterns of sequencing coverage along a bacterial genome, summarized by a peak-to-trough ratio (PTR), have been shown to accurately reflect microbial growth rates, revealing a new facet of microbial dynamics and host-microbe interactions. Here, we introduce CoPTR (Compute PTR): a tool for computing PTRs from complete reference genomes and assemblies. Using simulations and data from growth experiments in simple and complex communities, we show that CoPTR is more accurate than the current state-of-the-art, while also providing more PTR estimates overall. We further develop theory formalizing a biological interpretation for PTRs. Using a reference database of 2935 species, we applied CoPTR to a case-control study of 1304 metagenomic samples from 106 individuals with inflammatory bowel disease. We show that growth rates are personalized, are only loosely correlated with relative abundances, and are associated with disease status. We conclude by demonstrating how PTRs can be combined with relative abundances and metabolomics to investigate their effect on the microbiome.
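To make the peak-to-trough idea concrete, here is a minimal sketch in Python. It is not CoPTR's estimator, which the abstract does not describe. It assumes read coverage has already been binned along a circular genome, smooths it with a circular moving average, and divides the highest smoothed value by the lowest. The function name `peak_to_trough_ratio` and the `window` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np

def peak_to_trough_ratio(coverage, window=5):
    """Naive PTR from binned coverage along a circular genome.

    Illustrative assumption only: this is NOT the CoPTR method.
    `window` must be an odd number of bins for the moving average.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    cov = np.asarray(coverage, dtype=float)
    pad = window // 2
    # Wrap the ends so the moving average respects the circular chromosome.
    padded = np.pad(cov, pad, mode="wrap")
    kernel = np.ones(window) / window
    smoothed = np.convolve(padded, kernel, mode="valid")
    # Peak (assumed near the replication origin) over trough (assumed near the terminus).
    return smoothed.max() / smoothed.min()

# Synthetic example: coverage doubles from the trough (position 0) to the peak (midpoint).
positions = np.linspace(0.0, 1.0, 100, endpoint=False)
synthetic = 2.0 ** (1.0 - 2.0 * np.abs(positions - 0.5))
print(peak_to_trough_ratio(synthetic, window=1))
print(peak_to_trough_ratio(synthetic, window=5))
```

With no smoothing (`window=1`) the synthetic profile has a ratio of exactly 2. Smoothing pulls the estimate slightly toward 1, which is one reason a model-based estimator may be preferable to a raw maximum-over-minimum on noisy data.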


2022 ◽  
pp. gr.275655.121
Author(s):  
Ni-Chen Chang ◽  
Quirze Rovira ◽  
Jonathan N Wells ◽  
Cedric Feschotte ◽  
Juan M Vaquerizas

There is considerable interest in understanding the effect of transposable elements (TEs) on embryonic development. Studies in humans and mice are limited by the difficulty of working with mammalian embryos, and by the relative scarcity of active TEs in these organisms. Zebrafish is an outstanding model for the study of vertebrate development and over half of its genome consists of diverse TEs. However, zebrafish TEs remain poorly characterized. Here we describe the demography and genomic distribution of zebrafish TEs and their expression throughout embryogenesis using bulk and single-cell RNA sequencing data. These results reveal a highly dynamic genomic ecosystem comprising nearly 2,000 distinct TE families, which vary in copy number by four orders of magnitude and span a wide range of ages. Longer retroelements tend to be retained in intergenic regions, whilst short interspersed nuclear elements (SINEs) and DNA transposons are more frequently found near or within genes. Locus-specific mapping of TE expression reveals extensive TE transcription during development. While two thirds of TE transcripts are likely driven by nearby gene promoters, we still observe stage- and tissue-specific expression patterns in self-regulated TEs. Long terminal repeat (LTR) retroelements are most transcriptionally active immediately following zygotic genome activation, whereas DNA transposons are enriched amongst transcripts expressed in later stages of development. Single-cell analysis reveals several endogenous retroviruses expressed in specific somatic cell lineages. Overall, our study provides a valuable resource for using zebrafish as a model to study the impact of TEs on vertebrate development.


2021 ◽  
pp. gr.275837.121
Author(s):  
Xiangxiu Wang ◽  
Wen Wang ◽  
Yiman Wang ◽  
Jia Chen ◽  
Guifen Liu ◽  
...  

Key transcription factors (TFs) play critical roles in zygotic genome activation (ZGA) during early embryogenesis, while genome-wide occupancies of only a few factors have been profiled during ZGA due to the limitation of cell numbers or the lack of high-quality antibodies. Here, we present FitCUT&RUN, a modified CUT&RUN method, in which an Fc fragment of immunoglobulin G is used for tagging, to profile TF occupancy in an antibody-free manner and demonstrate its reliability and robustness using as few as five thousand K562 cells. We applied FitCUT&RUN to zebrafish undergoing embryogenesis to generate reliable occupancy profiles of three known activators of zebrafish ZGA: Nanog, Pou5f3 and Sox19b. By profiling the time-series occupancy of Nanog during zebrafish ZGA, we observed a clear trend toward a gradual increase in Nanog occupancy and found that Nanog occupancy prior to the major phase of ZGA is critical for the activation of a significant proportion of early transcribed genes. Our results further suggested that the sequential binding of Nanog may be controlled by replication timing and the presence of Nanog motifs.


2021 ◽  
Author(s):  
Terence Gall-Duncan ◽  
Nozomu Sato ◽  
Ryan K.C. Yuen ◽  
Christopher E. Pearson

Expansions of gene-specific DNA tandem repeats (TRs), first described in 1991 as a disease-causing mutation in humans, are now known to cause >60 phenotypes, not just disease, and not only in humans. TRs are a common form of genetic variation with biological consequences, observed, so far, in humans, dogs, plants, oysters, and yeast. Repeat diseases show atypical clinical features, genetic anticipation, and multiple and partially penetrant phenotypes among family members. Discovery of disease-causing repeat expansion loci accelerated through technological advances in DNA sequencing and computational analyses. Between 2019 and 2021, 17 new disease-causing TR expansions were reported, totaling 63 TR loci (>69 diseases), with a likelihood of more discoveries, and in more organisms. Recent and historical lessons reveal that properly assessed clinical presentations, coupled with genetic and biological awareness, can guide discovery of disease-causing unstable TRs. We highlight critical but underrecognized aspects of TR mutations. Repeat motifs may not be present in current reference genomes but will be in forthcoming gapless long-read references. Repeat motif size can be a single nucleotide to kilobases/unit. At a given locus, repeat motif sequence purity can vary with consequence. Pathogenic repeats can be “insertions” within nonpathogenic TRs. Expansions, contractions, and somatic length variations of TRs can have clinical/biological consequences. TR instabilities occur in humans and other organisms. TRs can be epigenetically modified and/or chromosomal fragile sites. We discuss the expanding field of disease-associated TR instabilities, highlighting prospects, clinical and genetic clues, tools, and challenges for further discoveries of disease-causing TR instabilities and understanding their biological and pathological impacts—a vista that is about to expand.


2021 ◽  
pp. gr.275579.121
Author(s):  
Daniel P Cooke ◽  
David C Wedge ◽  
Gerton Lunter

Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, where genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method, Octopus, that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid datasets created using in silico mixtures of diploid Genome In a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths, but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana datasets.
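The in silico mixture idea can be illustrated with a small Python sketch. It is an assumption about how expected genotypes could be derived, not the authors' pipeline: if reads from two diploid samples are pooled, the expected tetraploid genotype at a site is taken to be the combined set of allele copies from both samples, and three samples would give a hexaploid genotype. The function names `mix_genotypes` and `allele_dosage` are hypothetical, and alleles are assumed to be encoded as integer indices shared across samples (0 for the reference allele, 1 for the first alternate, and so on).

```python
from collections import Counter

def mix_genotypes(*diploid_genotypes):
    """Combine per-sample diploid genotypes into one polyploid genotype.

    Hypothetical helper: each genotype is a tuple of allele indices at one
    site, e.g. (0, 1) for a heterozygote. All samples are assumed to use
    the same allele indexing.
    """
    alleles = [allele for gt in diploid_genotypes for allele in gt]
    # Sort so the result is an unordered (unphased) genotype in canonical form.
    return tuple(sorted(alleles))

def allele_dosage(genotype):
    """Count the copies of each allele in a genotype."""
    return dict(Counter(genotype))

# Two diploids give a tetraploid genotype; three give a hexaploid genotype.
tetraploid = mix_genotypes((0, 1), (1, 1))
hexaploid = mix_genotypes((0, 0), (0, 1), (1, 1))
print(tetraploid, allele_dosage(tetraploid))
print(hexaploid, allele_dosage(hexaploid))
```

Under this assumption, a caller's output can be scored against the combined genotype at each site. Getting the dosage right (for example one versus three copies of the alternate allele) is the step that has no counterpart in diploid calling.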


2021 ◽  
Author(s):  
Carlos Vargas-Chavez ◽  
Neil Michel Longo Pendy ◽  
Sandrine E. Nsango ◽  
Laura Aguilera ◽  
Diego Ayala ◽  
...  

Anopheles coluzzii is one of the primary vectors of human malaria in sub-Saharan Africa. Recently, it has spread into the main cities of Central Africa, threatening vector control programs. The adaptation of An. coluzzii to urban environments partly results from an increased tolerance to organic pollution and insecticides. Some of the molecular mechanisms for ecological adaptation are known, but the role of transposable elements (TEs) in the adaptive processes of this species has not been studied yet. As a first step toward assessing the role of TEs in rapid urban adaptation, we used long reads to sequence six An. coluzzii genomes from natural breeding sites in two major Central African cities. We de novo annotated TEs in these genomes and in an additional high-quality An. coluzzii genome, and we identified 64 new TE families. TEs were nonrandomly distributed throughout the genome with significant differences in the number of insertions of several superfamilies across the studied genomes. We identified seven putatively active families with insertions near genes with functions related to vectorial capacity, and several TEs that may provide promoter and transcription factor binding sites to insecticide resistance and immune-related genes. Overall, the analysis of multiple high-quality genomes allowed us to generate the most comprehensive TE annotation in this species to date and identify several TE insertions that could potentially impact both genome architecture and the regulation of functionally relevant genes. These results provide a basis for future studies of the impact of TEs on the biology of An. coluzzii.


2021 ◽  
Author(s):  
Hossein Salari ◽  
Marco Di Stefano ◽  
Daniel Jost

Chromosome organization and dynamics are involved in regulating many fundamental processes such as gene transcription and DNA repair. Experiments have revealed that chromatin motion is highly heterogeneous inside cell nuclei, ranging from a liquid-like, mobile state to a gel-like, rigid regime. Using polymer modeling, we investigate how these different physical states and dynamical heterogeneities may emerge from the same structural mechanisms. We found that the formation of topologically associating domains (TADs) is a key driver of chromatin motion heterogeneity. In particular, we showed that the local degree of compaction of the TAD regulates the transition from a weakly compact, fluid state of chromatin to a more compact, gel state exhibiting anomalous diffusion and coherent motion. Our work provides a comprehensive study of chromosome dynamics and a unified view of chromatin motion enabling interpretation of the wide variety of dynamical behaviors observed experimentally across different biological conditions, suggesting that the “liquid” and “solid” states of chromatin are in fact two sides of the same coin.


2021 ◽  
Author(s):  
Siwei Chen ◽  
Yuan Liu ◽  
Yingying Zhang ◽  
Shayne D. Wierbowski ◽  
Steven M. Lipkin ◽  
...  

Rapid accumulation of cancer genomic data has led to the identification of an increasing number of mutational hotspots with uncharacterized significance. Here we present a biologically informed computational framework that characterizes the functional relevance of all 1107 published mutational hotspots identified in approximately 25,000 tumor samples across 41 cancer types in the context of a human 3D interactome network, in which the interface of each interaction is mapped at residue resolution. Hotspots reside in network hub proteins and are enriched on protein interaction interfaces, suggesting that alteration of specific protein–protein interactions is critical for the oncogenicity of many hotspot mutations. Our framework enables, for the first time, systematic identification of specific protein interactions affected by hotspot mutations at the full proteome scale. Furthermore, by constructing a hotspot-affected network that connects all hotspot-affected interactions throughout the whole-human interactome, we uncover genome-wide relationships among hotspots and implicate novel cancer proteins that do not harbor hotspot mutations themselves. Moreover, applying our network-based framework to specific cancer types identifies clinically significant hotspots that can be used for prognosis and therapy targets. Overall, we show that our framework bridges the gap between the statistical significance of mutational hotspots and their biological and clinical significance in human cancers.


2021 ◽  
Author(s):  
Lu Liu ◽  
He Chen ◽  
Cheng Sun ◽  
Jianyun Zhang ◽  
Juncheng Wang ◽  
...  

Genomic-scale somatic copy number alterations in healthy humans are difficult to investigate because of their low occurrence rates and the stochastic nature of structural variations. Using a Tn5-transposase-assisted single-cell whole-genome sequencing method, we sequenced over 20,000 single lymphocytes from 16 individuals. Then, with the scale increased to a few thousand single cells per individual, we found that about 7.5% of the cells had large-size copy number alterations. Trisomy 21 was the most prevalent aneuploid event among all autosomal copy number alterations, whereas monosomy X occurred most frequently in over-30-yr-old females. In the monosomy X single cells from individuals with phased genomes and identified X-inactivation ratios in bulk, the inactive X Chromosomes were lost more often than the active ones.

