segmental duplications
Recently Published Documents


TOTAL DOCUMENTS: 340 (FIVE YEARS: 136)
H-INDEX: 44 (FIVE YEARS: 8)

2021 ◽  
Author(s):  
David Porubsky ◽  
Wolfram Höps ◽  
Hufsah Ashraf ◽  
PingHsun Hsieh ◽  
Bernardo Rodriguez-Martin ◽  
...  

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1-retrotransposition; 80% of the larger inversions are balanced and affect twice as many base pairs as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or mobile elements. Since this suggests recurrence due to non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus and generation. Recurrent inversions exhibit a sex-chromosomal bias, and significantly co-localize to the critical regions of genomic disorders. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes to disease-causing CNVs.


2021 ◽  
Author(s):  
Arun H. Patil ◽  
Marc K. Halushka ◽  
Bastian K. Fromm

The telomere-to-telomere (T2T) genome project discovered and mapped ~240 million additional base pairs of primarily telomeric and centromeric reads. Much of this sequence consisted of satellite sequences and large segmental duplications. We evaluated the extent to which human bona fide microRNAs (miRNAs) may be found in additional paralogous genomic loci or if previously undescribed microRNAs are present in these newly sequenced regions of the human genome. New genomic regions of the T2T project spanning ~240 million bp of sequence were obtained and evaluated by blastn for the human miRNAs contained in MirGeneDB2.0 (N = 556) and miRBase (N = 1917) along with all species of MirGeneDB2.0 miRNAs (N = 10,899). Additionally, bowtie was used to compare unmapped reads from >4,000 primary cell samples to the new T2T sequence. Based on sequence and structure, no bona fide miRNAs were identified. Ninety-seven miRNAs of questionable authenticity (frequently known repeat elements) were identified from the miRBase dataset across the newly described regions of the human genome. These 97 represent only 51 miRNA families due to paralogy of highly similar miRNAs such as 24 members of the hsa-mir-548 family. Altogether, these data strongly support our having identified widely expressed bona fide miRNAs in the human genome and move us further toward the completion of human miRNA discovery.


2021 ◽  
Author(s):  
Huishi Toh ◽  
Chentao Yang ◽  
Giulio Formenti ◽  
Kalpana Raja ◽  
Lily Yan ◽  
...  

The Nile rat (Arvicanthis niloticus) is an important animal model for biomedical research, including the study of diurnal rhythms and type 2 diabetes. Here, we report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3,613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including those that affect genes associated with type 2 diabetes and metabolic dysfunctions. These include 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse. Our findings reflect the exceptional level of genomic detail present in this assembly, which will greatly expand the potential of the Nile rat as a model organism for genetic studies.
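For readers unfamiliar with the contiguity statistic quoted above, N50 is conventionally the length L such that sequences of length L or longer contain at least half of the total assembled bases. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units), not data from the paper.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

Because N50 depends only on the sorted length distribution, contig N50 and scaffold N50 are obtained by passing the respective length lists to the same function.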


2021 ◽  
Author(s):  
Kang-Wook Kim ◽  
Rishi De-Kayne ◽  
Ian J. Gordon ◽  
Kennedy Saitoti Omufwoko ◽  
Dino J. Martins ◽  
...  

Supergenes maintain adaptive clusters of alleles in the face of genetic mixing. Although usually attributed to inversions, there are few cases in which the specific mechanisms of recombination suppression, and their timing, have been reconstructed in detail. We investigated the origin of the BC supergene, which controls variation in warning colouration in the African Monarch butterfly, Danaus chrysippus. By generating chromosome-scale assemblies for all three alleles, we identified multiple structural differences. Most strikingly, we find that a region of >1 million bp underwent several segmental duplications at least 7.5 million years ago. The resulting duplicated fragments appear to have triggered four inversions in surrounding parts of the chromosome, resulting in stepwise growth of the region of suppressed recombination. Phylogenies for the inversions are incongruent with the species tree, and suggest that structural polymorphisms have persisted for at least 4.1 million years. In addition to the role of duplications in triggering inversions, our results suggest a previously undescribed mechanism of recombination suppression through independent losses of divergent duplicated tracts. Overall, our findings challenge the idea of instantaneous supergene evolution through a single inversion event, instead pointing towards a stepwise process involving a variety of structural changes.


2021 ◽  
Author(s):  
Eleni Adam ◽  
Desh Ranjan ◽  
Harold Riethman

Background: Human subtelomeric DNA regulates the length and stability of adjacent telomeres that are critical for cellular function, and contains many gene/pseudogene families. Large evolutionarily recent segmental duplications and associated structural variation in human subtelomeres have made complete sequencing and assembly of these regions difficult to impossible for many loci, complicating or precluding a wide range of genetic analyses to investigate their function. Results: We present a hybrid assembly method, NanoPore Guided REgional Assembly Tool (NPGREAT), which combines Linked-Read data with ultralong nanopore reads spanning subtelomeric segmental duplications to potentially overcome these difficulties. Linked-Read sets identified by matches with 1-copy subtelomere sequence adjacent to segmental duplications are assembled and extended into the segmental duplication regions using Regional Extension of Assemblies using Linked-Reads (REXTAL). Telomere-containing ultralong nanopore reads are then used to provide contiguity and correct orientation for matching REXTAL sequence contigs as well as identification/correction of any misassemblies (associated primarily with tandem repeats). While we focus on subtelomeres, the method is generally applicable to assembly of segmental duplications and other complex genome regions. Our method was tested for a subset of representative subtelomeres with ultralong nanopore read coverage in GM12878. 10X Linked-Read datasets with high depth of coverage and a TELL-seq Linked-Read dataset with lower depth of coverage were each combined with the ultralong nanopore reads from the same genome to provide improved assemblies. Tandem repeat regions of the short-read assemblies, which are especially prone to misassembly due to collapse of matching tandemly repeated reads, were readily identified and properly sized by comparison with the nanopore reads. Conclusion: The NPGREAT method resulted in extension of high-quality assemblies into otherwise inaccessible segmental duplication regions near telomeres, enhancing our ability to accurately assemble human subtelomere DNA. This information will enable improved analyses of the structure, function, and evolution of these key regions.


Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1815
Author(s):  
João Ricchio ◽  
Fabiana Uno ◽  
A. Bernardo Carvalho

Y chromosomes play important roles in sex determination and male fertility. In several groups (e.g., mammals) there is strong evidence that they evolved through gene loss from a common X-Y ancestor, but in Drosophila the acquisition of new genes plays a major role. This conclusion came mostly from studies in two species. Here we report the identification of the 22 Y-linked genes in D. willistoni. They all fit the previously observed pattern of autosomal or X-linked testis-specific genes that duplicated to the Y. The ratio of gene gains to gene losses is ~25 in D. willistoni, confirming the prominent role of gene gains in the evolution of Drosophila Y chromosomes. We also found four large segmental duplications (ranging from 62 kb to 303 kb) from autosomal regions to the Y, containing ~58 genes. All but four of these duplicated genes became pseudogenes in the Y or disappeared. In the GK20609 gene the Y-linked copy remained functional, whereas its original autosomal copy degenerated, demonstrating how autosomal genes are transferred to the Y chromosome. Since the segmental duplication that carried GK20609 contained six other testis-specific genes, it seems that chance plays a significant role in the acquisition of new genes by the Drosophila Y chromosome.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yunfei Wen ◽  
Ali Raza ◽  
Wen Chu ◽  
Xiling Zou ◽  
Hongtao Cheng ◽  
...  

TCP proteins are plant-specific transcription factors with multiple roles in plant developmental processes and stress responses. A genome-wide analysis was therefore performed to categorize the TCP genes in the rapeseed genome. In this study, a total of 80 BnTCP genes were identified in the rapeseed genome and grouped into two main classes (PCF and CYC/TB1) according to phylogenetic analysis. Evolutionary analysis revealed that BnTCP genes had experienced segmental duplications and positive selection pressure. Gene structure and conserved motif analysis showed that Class I and Class II have diverse intron-exon patterns and motif numbers. Overall, nine conserved motifs were identified, with the number per TCP gene varying from 2 to 7, and some of them were gene-specific. In particular, Class II (PCF and CYC/TB1) possessed diverse structures compared to Class I. We identified four hormone- and four stress-related responsive cis-elements in the promoter regions. Moreover, 32 bna-miRNAs from 14 families were found to target 21 BnTCP genes. Gene ontology enrichment analysis indicated that the BnTCP genes were primarily related to RNA/DNA binding, metabolic processes, transcriptional regulatory activities, etc. Transcriptome-based tissue-specific expression analysis showed that only a few genes (mainly BnTCP9, BnTCP22, BnTCP25, BnTCP48, BnTCP52, BnTCP60, BnTCP66, and BnTCP74) had higher expression in root, stem, leaf, flower, seeds, and silique among all tested tissues. Likewise, qRT-PCR-based expression analysis revealed that BnTCP36, BnTCP39, BnTCP53, BnTCP59, and BnTCP60 showed higher expression at certain time points under various hormone and abiotic stress treatments, but not under drought and MeJA. Our results lay the groundwork for future understanding of the intricate mechanisms of BnTCP in various developmental processes and abiotic stress signaling pathways in rapeseed.


2021 ◽  
Vol 118 (47) ◽  
pp. e2102842118
Author(s):  
Lila Mouakkad-Montoya ◽  
Michael M. Murata ◽  
Arvis Sulovari ◽  
Ryusuke Suzuki ◽  
Beth Osia ◽  
...  

Extrachromosomal circular DNA (eccDNA) originates from linear chromosomal DNA in various human tissues under physiological and disease conditions. The genomic origins of eccDNA have largely been investigated using in vitro–amplified DNA. However, in vitro amplification obscures quantitative information by skewing the total population stoichiometry. In addition, the analyses have focused on eccDNA stemming from single-copy genomic regions, leaving eccDNA from multicopy regions unexamined. To address these issues, we isolated eccDNA without in vitro amplification (naïve small circular DNA, nscDNA) and assessed the populations quantitatively by integrated genomic, molecular, and cytogenetic approaches. nscDNA of up to tens of kilobases were successfully enriched by our approach and were predominantly derived from multicopy genomic regions including segmental duplications (SDs). SDs, which account for 5% of the human genome and are hotspots for copy number variations, were significantly overrepresented in sperm nscDNA, with three times more sequencing reads derived from SDs than from the entire single-copy regions. SDs were also overrepresented in mouse sperm nscDNA, which we estimated to comprise 0.2% of nuclear DNA. Considering that eccDNA can be integrated into chromosomes, germline-derived nscDNA may be a mediator of genome diversity.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12491
Author(s):  
Xianwen Meng ◽  
Jing Liu ◽  
Mingde Zhao

Background: Flax (Linum usitatissimum) is an important crop for its seed oil and stem fiber. Really Interesting New Gene (RING) finger genes play essential roles in growth, development, and biotic and abiotic stress responses in plants. However, little is known about these genes in flax. Methods: Here, we performed a systematic genome-wide analysis to identify RING finger genes in flax. Results: We identified 587 RING domains in 574 proteins and classified them into RING-H2 (292), RING-HCa (181), RING-HCb (23), RING-v (53), RING-C2 (31), RING-D (2), RING-S/T (3), and RING-G (2). These proteins were further divided into 45 groups according to domain organization. These genes were located on 15 chromosomes and clustered into three clades according to their phylogenetic relationships. A total of 312 segmentally duplicated gene pairs were inferred from 411 RING finger genes, indicating a major contribution of segmental duplications to the RING finger gene family expansion. The non-synonymous/synonymous substitution ratio of the segmentally duplicated gene pairs was less than 1, suggesting that the gene family has been under negative selection since duplication. Further, most RING genes in flax were differentially expressed during seed development or in the shoot apex. This study provides useful information for further functional analysis of RING finger genes in flax and for developing gene-derived molecular markers in flax breeding.
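The selection inference above rests on the non-synonymous/synonymous substitution ratio (often written Ka/Ks): a ratio below 1 is conventionally read as negative (purifying) selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection. The sketch below only illustrates that interpretation rule. It does not estimate Ka or Ks from sequences, the function names and the `tolerance` parameter are assumptions introduced here, and the input values are hypothetical rather than taken from the study.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks for one duplicated gene pair; Ks must be positive."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def classify_selection(ka, ks, tolerance=0.0):
    """Label a gene pair by the conventional reading of Ka/Ks.

    `tolerance` (an assumption of this sketch) widens the band around 1
    that is treated as neutral.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio < 1 - tolerance:
        return "negative"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical Ka and Ks values for a single duplicated pair.
print(classify_selection(0.02, 0.10))
```

In practice Ka and Ks come from a codon-aware alignment of each gene pair, and the classification is then applied pair by pair.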


2021 ◽  
Author(s):  
Feyza Yilmaz ◽  
Umamaheswaran Gurusamy ◽  
Trenell Mosley ◽  
Yulia Mostovoy ◽  
Tamim H. Shaikh ◽  
...  

Chromosomal rearrangements that alter the copy number of dosage-sensitive genes can result in genomic disorders, such as the 3q29 deletion syndrome. At the 3q29 region, non-allelic homologous recombination (NAHR) between paralogous copies of segmental duplications (SDs) leads to a recurrent ~1.6 Mbp deletion or duplication, causing neurodevelopmental and psychiatric phenotypes. However, risk factors contributing to NAHR at this locus are not well understood. In this study, we used an optical mapping approach to identify structural variations within the 3q29 interval. We identified 18 novel haplotypes among 161 unaffected individuals and used this information to characterize this region in 18 probands with either the 3q29 deletion or 3q29 duplication syndrome. A significant amount of variation in haplotype prevalence was observed between populations. Within probands, we narrowed down the breakpoints to a ~5 kbp segment within the SD blocks in 89% of the 3q29 deletion and duplication cases studied. Furthermore, all 3q29 deletion and duplication cases could be categorized into one of five distinct classes based on their breakpoints. Contrary to previous findings for other recurrent deletion and duplication loci, there was no evidence for inversions in either parent of the probands mediating the deletion or duplication seen in this syndrome.

