Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population

Genes ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 352 ◽  
Author(s):  
Karen H. Miga

The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet because there is no reference assembly and there are no informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support the view that these regions represent a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.

2019 ◽  
Author(s):  
Karen H. Miga ◽  
Sergey Koren ◽  
Arang Rhie ◽  
Mitchell R. Vollger ◽  
Ariel Gershman ◽  
...  

After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. The remaining gaps include ribosomal DNA (rDNA) arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38 [2], along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the ∼2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.


2021 ◽  
Vol 22 (9) ◽  
pp. 4309
Author(s):  
Jitendra Thakur ◽  
Jenika Packiaraj ◽  
Steven Henikoff

Satellite DNA consists of abundant tandem repeats that play important roles in cellular processes, including chromosome segregation, genome organization and chromosome end protection. Most satellite DNA repeat units are either of nucleosomal length or 5–10 bp long and occupy centromeric, pericentromeric or telomeric regions. Due to high repetitiveness, satellite DNA sequences have largely been absent from genome assemblies. Although few conserved satellite-specific sequence motifs have been identified, DNA curvature, dyad symmetries and inverted repeats are features of various satellite DNAs in several organisms. Satellite DNA sequences are either embedded in highly compact gene-poor heterochromatin or specialized chromatin that is distinct from euchromatin. Nevertheless, some satellite DNAs are transcribed into non-coding RNAs that may play important roles in satellite DNA function. Intriguingly, satellite DNAs are among the most rapidly evolving genomic elements, such that a large fraction is species-specific in most organisms. Here we describe the different classes of satellite DNA sequences, their satellite-specific chromatin features, and how these features may contribute to satellite DNA biology and evolution. We also discuss how the evolution of functional satellite DNA classes may contribute to speciation in plants and animals.


Genome ◽  
1997 ◽  
Vol 40 (5) ◽  
pp. 774-781 ◽  
Author(s):  
Daniela Ester Cardone ◽  
Marcello Marotta ◽  
Claudia Rosati ◽  
Gianni Chinali ◽  
Isidoro Feliciello

Digestion of Rana graeca italica DNA with Asp718I produces highly repetitive fragments of 281 and 385 bp that were cloned and sequenced. The shorter fragment corresponds to the unit repeat (RgiS1b) of a satellite DNA. The longer fragment was found to be part of a 494-bp repeat of another satellite DNA (RgiS1a) that was cloned intact as an EcoRV fragment. RgiS1b is 97% homologous to RgiS1a, from which it seems to be derived by a single deletion. Among all species tested, only the related brown frog Rana dalmatina contained homologous repetitive DNA. The overall number of RgiS1a and RgiS1b repeats per R. graeca italica haploid genome was estimated to be 2.7 × 10^5. RgiS1a and RgiS1b repeats are organized in separate arrays, but repetitive units formed by various combinations of the two repeats were also observed on Southern blots. The amount of these extra repeats varies greatly among animals from the same population, representing a rare case of individual variability in the satellite DNA organization. FISH with probes specific for both satellites, or for RgiS1a only, labeled the centromeric and pericentromeric heterochromatin of all chromosomes. This indicated that RgiS1a and RgiS1b are interspersed within the same heterochromatic regions of the chromosomes. Key words: satellite DNA, nucleotide sequence analysis, tandem repeats organization, amphibian chromosomes.


2021 ◽  
Vol 22 (9) ◽  
pp. 4707
Author(s):  
Mariana Lopes ◽  
Sandra Louzada ◽  
Margarida Gama-Carvalho ◽  
Raquel Chaves

(Peri)centromeric repetitive sequences and, more specifically, satellite DNA (satDNA) sequences, constitute a major human genomic component. SatDNA sequences can vary on a large number of features, including nucleotide composition, complexity, and abundance. Several satDNA families have been identified and characterized in the human genome through time, albeit at different speeds. Human satDNA families present a high degree of sub-variability, leading to the definition of various subfamilies with different organization and clustered localization. Evolution of satDNA analysis has enabled the progressive characterization of satDNA features. Despite recent advances in the sequencing of centromeric arrays, comprehensive genomic studies to assess their variability are still required to provide accurate and proportional representation of satDNA (peri)centromeric/acrocentric short arm sequences. Approaches combining multiple techniques have been successfully applied and seem to be the path to follow for generating integrated knowledge in the promising field of human satDNA biology.


2002 ◽  
Vol 77 (8) ◽  
pp. 773-782 ◽  
Author(s):  
Cindy Pham Lorentz ◽  
Eric D. Wieben ◽  
Ayalew Tefferi ◽  
David A.H. Whiteman ◽  
Gordon W. Dewald

Genome ◽  
1998 ◽  
Vol 41 (2) ◽  
pp. 148-153 ◽  
Author(s):  
Monique Abadon ◽  
Eric Grenier ◽  
Christian Laumond ◽  
Pierre Abad

An AluI satellite DNA family has been cloned from the entomopathogenic nematode Heterorhabditis indicus. This repeated sequence appears to be an unusually abundant satellite DNA, since it constitutes about 45% of the H. indicus genome. The consensus sequence is 174 nucleotides long and has an A + T content of 56%, with the presence of direct and inverted repeat clusters. DNA sequence data reveal that monomers are quite homogeneous. Such homogeneity suggests that some mechanism is acting to maintain the homogeneity of this satellite DNA, despite its abundance, or that this repeated sequence could have appeared recently in the genome of H. indicus. Hybridization analysis of genomic DNAs from different Heterorhabditis species shows that this satellite DNA sequence is specific to the H. indicus genome. Considering the species specificity and the high copy number of this AluI satellite DNA sequence, it could provide a rapid and powerful tool for identifying H. indicus strains. Key words: AluI repeated DNA, tandem repeats, species-specific sequence, nucleotide sequence analysis.
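The two summary statistics mentioned in this abstract, A + T content and monomer homogeneity, can be illustrated with a minimal sketch. The function names and the toy monomers below are hypothetical and are not taken from the paper; the sketch also assumes the monomers are already aligned and of equal length, which real satellite monomers often are not.

```python
from itertools import combinations


def at_content(seq):
    """Fraction of bases in `seq` that are A or T (case-insensitive)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def pairwise_identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned monomers."""
    if len(a) != len(b):
        raise ValueError("monomers must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_identity(monomers):
    """Average identity over all monomer pairs; a simple proxy for array homogeneity."""
    pairs = list(combinations(monomers, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy monomers, for illustration only.
toy_monomers = ["AATTGCAT", "AATTGCAT", "AATTGGAT"]
print(at_content(toy_monomers[0]))
print(mean_identity(toy_monomers))
```

A mean identity close to 1 corresponds to the homogeneous array described in the abstract; lower values would indicate more divergent monomers.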


2021 ◽  
Vol 18 ◽  
Author(s):  
Xinyan Liang ◽  
Haijian Wu ◽  
Mark Colt ◽  
Xinying Guo ◽  
Brock Pluimer ◽  
...  

Alzheimer’s disease (AD) is the most prevalent form of dementia worldwide. While its discovery and pathological manifestations are centered on protein aggregations of amyloid-beta (Aβ) and hyperphosphorylated tau protein, neuroinflammation has emerged in the last decade as a main component of the disease in both pathogenesis and progression. As the main innate immune cell type in the central nervous system (CNS), microglia play a very important role in regulating neuroinflammation, which occurs commonly in neurodegenerative conditions including AD. During the inflammatory response, microglia undergo morphological changes and transition from homeostatic to activated states. Different microglia subtypes displaying distinct genetic profiles have been identified in AD, and these signatures are often linked to AD risk genes identified in genome-wide association studies (GWAS), such as APOE and TREM2. Furthermore, many AD risk genes are highly enriched in microglia and specifically influence microglial functions in pathogenesis, e.g., releasing inflammatory cytokines and clearing Aβ. Therefore, building a landscape of these risk genes in microglia, based on current preclinical studies and in the context of their pathogenic or protective effects, would greatly help us understand the complex etiology of AD and provide new insight toward the unmet need for effective treatment.


2004 ◽  
Vol 36 (5) ◽  
pp. 512-517 ◽  
Author(s):  
Jonathan Marchini ◽  
Lon R Cardon ◽  
Michael S Phillips ◽  
Peter Donnelly

Genetics ◽  
1994 ◽  
Vol 136 (1) ◽  
pp. 333-341
Author(s):  
W Stephan ◽  
S Cho

A simulation model of sequence-dependent amplification, unequal crossing over and mutation is analyzed. This model predicts the spontaneous formation of tandem-repetitive patterns of noncoding DNA from arbitrary sequences for a wide range of parameter values. Natural selection is found to play an essential role in this self-organizing process. Natural selection, which is modeled as a mechanism for controlling the length of a nucleotide string but not the sequence itself, favors the formation of tandem-repetitive structures. Two measures of sequence heterogeneity, inter-repeat variability and repeat length, are analyzed in detail. For fixed mutation rate, both inter-repeat variability and repeat length are found to increase with decreasing rates of (unequal) crossing over. The results are compared with data on micro-, mini- and satellite DNAs. The properties of minisatellites and satellite DNAs resemble the simulated structures very closely. This suggests that unequal crossing over is a dominant long-range ordering force which keeps these arrays homogeneous even in regions of very low recombination rates, such as at satellite DNA loci. Our analysis also indicates that in regions of low rates of (unequal) crossing over, inter-repeat variability is maintained at a low level at the expense of much larger repeat units (multimeric repeats), which are characteristic of satellite DNA. In contrast, the microsatellite data do not fit the proposed model well, suggesting that unequal crossing over does not act on these very short tandem arrays.
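To make the ingredients of such a model concrete, here is a minimal sketch of a tandem array evolving under point mutation, unequal crossing over, and a length cap standing in for selection on string length. It is an illustrative toy, not the authors' model: it omits sequence-dependent amplification, starts from an existing repeat rather than an arbitrary sequence, and every name, rate, and default value is an assumption chosen for the example.

```python
import random


def mutate(array, rate, rng, alphabet="ACGT"):
    """Redraw each base with probability `rate`.

    The redraw may pick the same base again, so the effective
    substitution rate is somewhat below `rate`.
    """
    return "".join(
        rng.choice(alphabet) if rng.random() < rate else base for base in array
    )


def unequal_crossover(array, unit_len, rng):
    """Recombine the array with a copy of itself misaligned by whole repeat units.

    One recombinant product is returned at random: either the one that
    gains `shift` bases or the one that loses `shift` bases.
    """
    n_units = len(array) // unit_len
    if n_units < 2:
        return array
    shift = rng.randint(1, n_units - 1) * unit_len
    i = rng.randint(shift, len(array))  # breakpoint on the first copy
    j = i - shift                       # misaligned breakpoint on the second copy
    if rng.random() < 0.5:
        return array[:i] + array[j:]    # duplication: length grows by `shift`
    return array[:j] + array[i:]        # deletion: length shrinks by `shift`


def simulate(unit="ACGTTGCA", copies=10, generations=200,
             mut_rate=0.001, xo_rate=0.5, max_len=400, seed=0):
    """Evolve one array; `max_len` is a crude stand-in for selection on length."""
    rng = random.Random(seed)
    array = unit * copies
    for _ in range(generations):
        array = mutate(array, mut_rate, rng)
        if rng.random() < xo_rate:
            candidate = unequal_crossover(array, len(unit), rng)
            if len(candidate) <= max_len:  # over-long products are rejected
                array = candidate
    return array


final = simulate()
print(len(final), len(final) // 8)
```

Because every exchange is offset by a whole number of repeat units, the array length stays a multiple of the unit length, and with `mut_rate=0.0` the array remains a perfect tandem repeat. Lowering `xo_rate` relative to `mut_rate` lets mutations accumulate between exchanges, which is the qualitative trade-off the abstract describes.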

