The structure, function, and evolution of a complete human chromosome 8

2020
Author(s):
Evan Eichler
Glennis Logsdon
Mitchell Vollger
PingHsun Hsieh
Yafei Mao
...  

Abstract The complete assembly of each human chromosome is essential for understanding human biology and evolution. Using complementary long-read sequencing technologies, we complete the first linear assembly of a human autosome, chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08 Mbp centromeric α-satellite array, a 644 kbp defensin copy number polymorphism important for disease risk, and an 863 kbp variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73 kbp hypomethylated region of diverse higher-order α-satellite enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. Using a dual long-read sequencing approach, we complete the assembly of the orthologous chromosome 8 centromeric regions in chimpanzee, orangutan, and macaque for the first time to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved specifically in the great ape ancestor, and the centromeric region evolved with a layered symmetry, with more ancient higher-order repeats located at the periphery adjacent to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated at least 2.2-fold, and this acceleration extends beyond the higher-order α-satellite into the flanking sequence.

Author(s):  
Glennis A. Logsdon
Mitchell R. Vollger
PingHsun Hsieh
Yafei Mao
Mikhail A. Liskovykh
...  

Abstract The complete assembly of each human chromosome is essential for understanding human biology and evolution. Using complementary long-read sequencing technologies, we complete the first linear assembly of a human autosome, chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08 Mbp centromeric α-satellite array, a 644 kbp defensin copy number polymorphism important for disease risk, and an 863 kbp variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73 kbp hypomethylated region of diverse higher-order α-satellite enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. Using a dual long-read sequencing approach, we complete the assembly of the orthologous chromosome 8 centromeric regions in chimpanzee, orangutan, and macaque for the first time to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved specifically in the great ape ancestor, and the centromeric region evolved with a layered symmetry, with more ancient higher-order repeats located at the periphery adjacent to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated at least 2.2-fold, and this acceleration extends beyond the higher-order α-satellite into the flanking sequence.


Nature
2021
Author(s):
Glennis A. Logsdon
Mitchell R. Vollger
PingHsun Hsieh
Yafei Mao
Mikhail A. Liskovykh
...  

Abstract The complete assembly of each human chromosome is essential for understanding human biology and evolution [1,2]. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.


2021
pp. 153537022110035
Author(s):
Jack NG Marshall
Ana Illera Lopez
Abigail L Pfaff
Sulev Koks
John P Quinn
...  

Understanding the mechanisms underlying tissue-specific and stimulus-inducible regulation is at the heart of understanding human biology and how this translates to wellbeing, the ageing process, and disease progression. Polymorphic DNA variation is superimposed as an extra layer of complexity in such processes, which underpin our individuality and are the focus of personalized medicine. This review focuses on the role and action of repetitive DNA, specifically variable number tandem repeats and SINE-VNTR-Alu domains, highlighting their role in modification of gene structure and gene expression in addition to their polymorphic nature being a genetic modifier of disease risk and progression. Although the literature focuses on their role in disease, it illustrates their potential to be major contributors to normal physiological function. To date, these elements have been under-reported in genomic analysis due to the difficulties in their characterization with short-read DNA sequencing methods. However, recent advances in long-read sequencing methods should resolve these problems, allowing for a greater understanding of their contribution to a host of genomic and functional mechanisms underlying physiology and disease.


Author(s):  
Cheng Yee Tang
Rick Twee-Hee Ong

Abstract Summary: Mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing is widely used to genotype Mycobacterium tuberculosis complex in epidemiological studies for tracking tuberculosis transmission. Recent long-read sequencing technologies from Pacific Biosciences and Oxford Nanopore Technologies can produce reads that are long enough to cover the entire repeat regions in each MIRU-VNTR locus, which was previously not possible using the short reads from Illumina high-throughput sequencing technologies. We thus developed MIRUReader for MIRU-VNTR typing directly from long sequence reads. Availability and implementation: Source code and documentation for the MIRUReader program are freely available at https://github.com/phglab/MIRUReader. Supplementary information: Supplementary data are available at Bioinformatics online.
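
The idea of reading a repeat count directly from one long read can be illustrated with a minimal sketch. This is not MIRUReader's implementation: the function name `estimate_repeat_count`, the flank sequences, and the toy read are all hypothetical, and the sketch assumes error-free reads with exact flank matches, which real long reads do not guarantee.

```python
from typing import Optional


def estimate_repeat_count(read: str, left_flank: str, right_flank: str,
                          unit_length: int) -> Optional[int]:
    """Estimate tandem-repeat copies between two flanks found in a single read.

    Returns None when either flank is missing, i.e. the read does not
    span the whole locus.
    """
    if unit_length <= 0:
        raise ValueError("unit_length must be positive")
    start = read.find(left_flank)
    if start == -1:
        return None
    repeat_start = start + len(left_flank)
    # Search for the right flank only downstream of the left flank.
    end = read.find(right_flank, repeat_start)
    if end == -1:
        return None
    span = end - repeat_start  # bases between the two flanks
    return round(span / unit_length)


# Toy locus (hypothetical sequences): five copies of an 8-bp unit.
unit = "ACGTTGCA"
read = "TTTT" + "GGATCC" + unit * 5 + "CTCGAG" + "AAAA"
copies = estimate_repeat_count(read, "GGATCC", "CTCGAG", len(unit))
print(copies)
```

A short read that ends inside the repeat array would return `None` here, which is the limitation the abstract attributes to short-read data. A practical tool would also need error-tolerant flank matching and a consensus across many reads.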


Pathogens
2021
Vol 10 (4)
pp. 427
Author(s):
Martyna Kasela
Agnieszka Grzegorczyk
Bożena Nowakowicz-Dębek
Anna Malm

Nursing homes (NH) contribute to the regional spread of methicillin-resistant Staphylococcus aureus (MRSA). Moreover, residents are vulnerable to the colonization and subsequent infection of MRSA etiology. We aimed to investigate the molecular and phenotypic characteristics of 21 MRSA collected from the residents and personnel in an NH (Lublin, Poland) during 2018. All MRSA were screened for 20 genes encoding virulence determinants (sea-see, eta, etb, tst, lukS-F-PV, eno, cna, ebpS, fib, bbp, fnbA, fnbB, icaADBC) and for resistance to 18 antimicrobials. To establish the relatedness and clonal complexes of MRSA in NH we applied multiple-locus variable-number tandem-repeat fingerprinting (MLVF), pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST) and staphylococcal cassette chromosome mec (SCCmec) typing. We identified four sequence types (ST) among two clonal complexes (CC): ST (CC22) known as EMRSA-15 as well as three novel STs, ST6295 (CC8), ST6293 (CC8) and ST6294. All tested MRSA were negative for sec, eta, etb, lukS-F-PV, bbp and ebpS genes. The most prevalent gene encoding toxin was sed (52.4%; n = 11/21), and adhesins were eno and fnbA (100%). Only 9.5% (n = 2/21) of MRSA were classified as multidrug-resistant. The emergence of novel MRSA with a unique virulence and the presence of epidemic clone EMRSA-15 creates challenges for controlling the spread of MRSA in NH.


2021
Vol 12 (1)
Author(s):
Tsung-Yu Lu
Katherine M. Munson
Alexandra P. Lewis
Qihui Zhu
Luke J. Tallon
...  

Abstract Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build an RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.


2021
Vol 4 (1)
Author(s):
J. Robert Macey
Stephan Pabinger
Charles G. Barbieri
Ella S. Buring
Vanessa L. Gonzalez
...  

Abstract Animal mitochondrial genomic polymorphism occurs as low-level mitochondrial heteroplasmy and deeply divergent co-existing molecules. The latter is rare, known only in bivalvian mollusks. Here we show two deeply divergent co-existing mt-genomes in a vertebrate through genomic sequencing of the Tuatara (Sphenodon punctatus), the sole representative of an ancient reptilian order. The two molecules, revealed using a combination of short-read and long-read sequencing technologies, differ by 10.4% nucleotide divergence. A single long-read covers an entire mt-molecule for both strands. Phylogenetic analyses suggest a 7–8 million-year divergence between genomes. Contrary to earlier reports, all 37 genes typical of animal mitochondria, with drastic gene rearrangements, are confirmed for both mt-genomes. Also unique to vertebrates, concerted evolution drives three near-identical putative Control Region non-coding blocks. Evidence of positive selection at sites linked to metabolically important transmembrane regions of encoded proteins suggests these two mt-genomes may confer an adaptive advantage for an unusually cold-tolerant reptile.

