The Wide Distribution and Horizontal Transfers of Beta Satellite DNA in Eukaryotes

2019 ◽  
Author(s):  
Yabin Guo ◽  
Jiawen Yang ◽  
Bin Yuan ◽  
Yu Wu ◽  
Meiyu Li ◽  
...  

Abstract Beta satellite DNA (satDNA) sequences, also known as Sau3A sequences, are repeated DNA elements reported in human and primate genomes. Beta satDNAs may play roles in genome stability and chromosome segregation during mitosis. It was previously thought that beta satDNAs originated in Old World monkeys and expanded rapidly in great apes. However, global, high-throughput studies of beta satDNAs are still lacking. Results: In this study, we searched 7,821 genome assemblies of 3,767 eukaryotic species and found that beta satDNAs are in fact widely distributed across eukaryotes. The four major branches of eukaryotes (animals, fungi, plants and Harosa/SAR) all have multiple clades containing beta satDNAs. These results were also confirmed by searching whole-genome sequencing data (SRA) and by PCR assay. Beta satDNA might have originated during the early evolution of eukaryotes. The wide but patchy distribution of beta satDNAs across eukaryotes presents a typical scenario of multiple horizontal transfers (HT). In contrast, beta satDNA sequences were found in all the primate clades, Primatomorpha and Euarchonta, indicating an origin in the common ancestor and vertical transmission thereafter. Beyond eukaryotes, beta satDNAs were also found in some archaea and bacteria, which presumably acquired them from eukaryotes via HT. Conclusion: Beta satDNAs are widespread in eukaryotes. The current distribution landscape of beta satDNA is the result of countless HTs. Our study shows for the first time that satellite DNAs can also undergo HT and will provide new ideas for future investigations in the HT/HGT field. Keywords: Beta satellite DNA, Sau3A sequences, Eukaryotes, Horizontal gene transfer, Primates

2019 ◽  
Author(s):  
Jiawen Yang ◽  
Bin Yuan ◽  
Yu Wu ◽  
Meiyu Li ◽  
Jian Li ◽  
...  

Abstract Beta satellite DNAs (satDNAs), also known as Sau3A sequences, are repeated DNA sequences reported in human and primate genomes. It was previously thought that beta satDNAs originated in Old World monkeys and expanded rapidly in great apes. In this study, we searched 7,821 genome assemblies of 3,767 eukaryotic species and found that beta satDNAs are widely distributed across eukaryotes. The four major branches of eukaryotes (animals, fungi, plants and Harosa/SAR) all have multiple clades containing beta satDNAs. These results were also confirmed by searching whole-genome sequencing data (SRA) and by PCR assay. Beta satDNA sequences were found in all the primate clades, as well as in Dermoptera and Scandentia, indicating that the beta satDNAs in primates might have originated in the common ancestor of Primatomorpha or Euarchonta. In contrast, the wide but patchy distribution of beta satDNAs across eukaryotes presents a typical scenario of multiple horizontal transfers. One-sentence summary: Beta satDNAs in Opimoda could be the result of HT from Diaphoretickes, and those in primates might have originated in the common ancestor of Primatomorpha.


Genomics ◽  
2020 ◽  
Vol 112 (6) ◽  
pp. 5295-5304
Author(s):  
Jiawen Yang ◽  
Bin Yuan ◽  
Yu Wu ◽  
Meiyu Li ◽  
Jian Li ◽  
...  

GigaScience ◽  
2020 ◽  
Vol 9 (6) ◽  
Author(s):  
Lisa K Johnson ◽  
Ruta Sahasrabudhe ◽  
James Anthony Gill ◽  
Jennifer L Roach ◽  
Lutz Froenicke ◽  
...  

Abstract Background: Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. Findings: Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30–45× sequence coverage, and the Illumina platform was used to generate 50–160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. Conclusions: High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
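The N50 contiguity metric cited in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `n50` and the contig lengths are hypothetical and chosen only for illustration. It uses the common definition of N50 as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the contig length L such that contigs of length >= L
    together account for at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs (not data from the study).
example = [2_700_000, 1_200_000, 400_000, 300_000, 100_000]
print(n50(example))
```

Because a single long contig can dominate the sum, N50 is usually read alongside the contig count and a completeness measure such as the BUSCO scores the abstract reports.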


Genome ◽  
1996 ◽  
Vol 39 (6) ◽  
pp. 1045-1050 ◽  
Author(s):  
Bärbel Grebenstein ◽  
Oliver Grebenstein ◽  
Wilhelm Sauer ◽  
Vera Hemleben

Distribution, organization, and molecular analysis of four unrelated satellite DNA components in Aveneae species are described. Highly repeated DNA elements were cloned from Helictotrichon convolutum (CON1 and CON2) and Helictotrichon compressum (COM1 and COM2). The lengths of the repeat monomers are 365 bp (CON1), 562 bp (CON2), 346 bp (COM1), and 476 bp (COM2). Similar repeats were detected by dot blots, Southern blots, and DNA sequencing in other species of the genus Helictotrichon, in Aveneae species, and in species of the tribes Andropogoneae and Oryzeae. The four satellite DNAs differ in their distribution across these taxonomic groups. Remarkably, the longer elements are built up in a complex pattern, either of shorter subrepeats arranged in tandem (COM2) or of duplications inserted into an original 369-bp element (CON2). Shorter representatives, 190 bp, similar to CON1 elements occur in Holcus species. In Koeleria species, COM1-related repeats are only 180 bp in length. No similarity was found among the sequences CON2, COM1, and COM2 or with sequences of other repetitive DNA elements of the grasses, but CON1 shows sequence similarity to an A genome specific repetitive DNA of Oryza (rice). Key words: genome evolution, grasses, Poaceae, repetitive DNA, wild oats.


2020 ◽  
Author(s):  
Eric S. Tvedte ◽  
Mark Gasser ◽  
Benjamin C. Sparklin ◽  
Jane Michalski ◽  
Xuechu Zhao ◽  
...  

ABSTRACT Background: The newest generation of DNA sequencing technology is highlighted by the ability to sequence reads hundreds of kilobases in length, and the increased availability of long read data has democratized the genome sequencing and assembly process. PacBio and Oxford Nanopore Technologies (ONT) have pioneered competitive long read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. Released in 2019, the PacBio Sequel II platform advertises substantial enhancements over previous PacBio systems. Results: We used whole-genome sequencing data produced by two PacBio platforms (Sequel II and RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacterium Escherichia coli and the fruit fly Drosophila ananassae. Sequel II assemblies had higher contiguity and consensus accuracy relative to other methods, even after accounting for differences in sequencing throughput. ONT Rapid Sequencing libraries had the fewest chimeric reads in addition to superior quantification of E. coli plasmids versus ligation-based libraries. The quality of assemblies can be enhanced by adopting hybrid approaches using Illumina libraries for bacterial genome assemblies or combined ONT and Sequel II libraries for eukaryotic genome assemblies. Genome-wide DNA methylation could be detected using both technologies; however, ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs in addition to undocumented D. ananassae motifs. Conclusions: The ideal choice of long read technology may depend on several factors including the question or hypothesis under examination. No single technology outperformed others in all metrics examined.


2019 ◽  
Author(s):  
Jiawen Yang ◽  
Yiting Zhou ◽  
Guangwei Ma ◽  
Xueyan Zhang ◽  
Yabin Guo

Abstract Beta satellite DNA (satDNA) sequences are repeated DNA elements located in primate centromeres and telomeres, and might play roles in genome stability and chromosome segregation. Beta satDNAs mainly exist in great apes. Previous studies suggested that beta satDNAs may have originated in Old World monkeys. In this study, we searched both the GenBank and SRA databases and identified beta satDNA sequences in the genomic sequences of 22 species. The beta satDNA sequences found in prosimians, Dermoptera and Scandentia indicate that beta satDNAs might have originated as early as 80 MYA. Strikingly, beta satDNA sequences were also found in a number of species evolutionarily distant from primates, including several endoparasites of humans and other great apes, which could be the result of multiple horizontal gene transfer (HGT) events. The similar phylogenetic profiles of beta satDNAs in the parasite genomes and the human genome indicate that the parasite beta satDNAs have undergone similar concerted evolution and play similar roles to the beta satDNAs in primates. Highlights: The largest-scale analysis of beta satDNAs to date. The origin of beta satDNAs was traced back to ∼80 MYA. The widespread presence of beta satDNAs in non-primate species resulted from multiple HGT events.


Author(s):  
Eric S Tvedte ◽  
Mark Gasser ◽  
Benjamin C Sparklin ◽  
Jane Michalski ◽  
Carl E Hjelmen ◽  
...  

Abstract The newest generation of DNA sequencing technology is highlighted by the ability to generate sequence reads hundreds of kilobases in length. Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) have pioneered competitive long read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. We used whole-genome sequencing data produced by three PacBio protocols (Sequel II CLR, Sequel II HiFi, RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacterium Escherichia coli and the fruit fly Drosophila ananassae. In both organisms tested, Sequel II assemblies had the highest consensus accuracy, even after accounting for differences in sequencing throughput. ONT and PacBio CLR had the longest reads sequenced compared to PacBio RS II and HiFi, and genome contiguity was highest when assembling these datasets. ONT Rapid Sequencing libraries had the fewest chimeric reads in addition to superior quantification of E. coli plasmids versus ligation-based libraries. The quality of assemblies can be enhanced by adopting hybrid approaches using Illumina libraries for bacterial genome assembly or polishing eukaryotic genome assemblies, and an ONT-Illumina hybrid approach would be more cost-effective for many users. Genome-wide DNA methylation could be detected using both technologies; however, ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs in addition to undocumented D. ananassae motifs. The ideal choice of long read technology may depend on several factors including the question or hypothesis under examination. No single technology outperformed others in all metrics examined.


Author(s):  
Johanna L. Jones ◽  
Mark A. Corbett ◽  
Elise Yeaman ◽  
Duran Zhao ◽  
Jozef Gecz ◽  
...  

Abstract Inherited paediatric cataract is a rare Mendelian disease that results in visual impairment or blindness due to a clouding of the eye’s crystalline lens. Here we report an Australian family with isolated paediatric cataract, which we had previously mapped to Xq24. Linkage at Xq24–25 (LOD = 2.53) was confirmed, and the region refined with a denser marker map. In addition, two autosomal regions with suggestive evidence of linkage were observed. A segregating 127 kb deletion (chrX:g.118373226_118500408del) in the Xq24–25 linkage region was identified from whole-genome sequencing data. This deletion completely removed a commonly deleted long non-coding RNA gene LOC101928336 and truncated the protein coding progesterone receptor membrane component 1 (PGRMC1) gene following exon 1. A literature search revealed a report of two unrelated males with non-syndromic intellectual disability, as well as congenital cataract, who had contiguous gene deletions that accounted for their intellectual disability but also disrupted the PGRMC1 gene. A morpholino-induced pgrmc1 knockdown in a zebrafish model produced significant cataract formation, supporting a role for PGRMC1 in lens development and cataract formation. We hypothesise that the loss of PGRMC1 causes cataract through disrupted PGRMC1-CYP51A1 protein–protein interactions and altered cholesterol biosynthesis. The cause of paediatric cataract in this family is the truncating deletion of PGRMC1, which we report as a novel cataract gene.

