Structural variation of the malaria-associated human glycophorin A-B-E region

2019 ◽  
Author(s):  
Sandra Louzada ◽  
Walid Algady ◽  
Eleanor Weyell ◽  
Luciana W. Zuccherato ◽  
Paulina Brajer ◽  
...  

Abstract Approximately 5% of the human genome consists of structural variants, which are enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly repeated regions about 120 kb in length, carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A and glycophorin B are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They act as receptors for invasion of a causative agent of malaria, Plasmodium falciparum. A particular complex structural variant (DUP4) that creates a GYPB/GYPA fusion gene is known to confer resistance to malaria. Many other structural variants exist, and remain poorly characterised. Here, we analyse sequences from 6466 genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection using fibre-FISH and breakpoint mapping. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by unequal crossover events (non-allelic homologous recombination, NAHR) and, by comparing the structural variant breakpoints with recombination hotspot maps, show the importance of a particular meiotic recombination hotspot on structural variant formation in this region.


2018 ◽  
Author(s):  
Walid Algady ◽  
Sandra Louzada ◽  
Danielle Carpenter ◽  
Paulina Brajer ◽  
Anna Färnert ◽  
...  

Abstract Glycophorin A and glycophorin B are red blood cell surface proteins that are both receptors for the parasite Plasmodium falciparum, which is the principal cause of malaria in sub-Saharan Africa. DUP4 is a complex structural genomic variant that carries extra copies of a glycophorin A - glycophorin B fusion gene, and has a dramatic effect on malaria risk by reducing the risk of severe malaria by up to 40%. Using fiber-FISH and Illumina sequencing, we validate the structural arrangement of the glycophorin locus in the DUP4 variant, and reveal somatic variation in copy number of the glycophorin A-glycophorin B fusion gene. By developing a simple, specific, PCR-based assay for DUP4 we show the DUP4 variant reaches a frequency of 13% in a village in south-eastern Tanzania. We genotype a substantial proportion of that village and demonstrate an association of DUP4 genotype with hemoglobin levels, a phenotype related to malaria, using a family-based association test. Taken together, we show that DUP4 is a complex structural variant that may be susceptible to somatic variation, and show that it is associated with a malarial-related phenotype in a non-hospitalized population.

Significance statement
Previous work has identified a human complex genomic structural variant called DUP4, which includes two novel glycophorin A-glycophorin B fusion genes and is associated with a profound protection against severe malaria. In this study, we present data showing the molecular basis of this complex variant. We also show evidence of somatic variation in the copy number of the fusion genes. We develop a simple robust assay for this variant and demonstrate that DUP4 is at an appreciable population frequency in Tanzania and that it is associated with higher hemoglobin levels in a malaria-endemic village. We suggest that DUP4 is therefore protective against malarial anemia.



2021 ◽  
Vol 12 ◽  
Author(s):  
Junfu Guo ◽  
Chang Shi ◽  
Xi Chen ◽  
Ou Wang ◽  
Ping Liu ◽  
...  

Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single base level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormal large gaps between co-barcoded reads to detect potential breakpoints and reconstruct complex structural variants (SVs). Haplotype phasing by co-barcoded reads increases the signal to noise ratio, and barcode sharing profiles are used to filter out false positives. We integrate the short read SV caller smoove for smaller variants with stLFRsv. The integrated pipeline was evaluated on the well-characterized genome HG002/NA24385, and 74.5% precision and a 22.4% recall rate were obtained for deletions. stLFRsv revealed some large variants not included in the benchmark set that were verified by long reads or assembly. For the HG001/NA12878 genome, stLFRsv also achieved the best performance for both resource usage and the detection of large variants. Our work indicates that co-barcoded read technology has the potential to improve genome completeness.
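The gap-based idea described in this abstract can be illustrated with a minimal sketch. This is not the stLFRsv implementation: the function name `candidate_breakpoints`, the `(barcode, position)` input format, and the 50 kbp `max_gap` threshold are all assumptions chosen for illustration. Reads sharing a barcode are assumed to come from one long DNA fragment, so an unusually large gap between neighbouring co-barcoded reads hints at a possible breakpoint.

```python
from collections import defaultdict


def candidate_breakpoints(reads, max_gap=50_000):
    """Flag large gaps between neighbouring reads that share a barcode.

    reads   -- iterable of (barcode, position) pairs on one chromosome
               (hypothetical input format, assumed for this sketch)
    max_gap -- gap size in bp above which a gap is reported
               (illustrative threshold, not taken from the paper)
    """
    # Group read positions by barcode.
    by_barcode = defaultdict(list)
    for barcode, pos in reads:
        by_barcode[barcode].append(pos)

    gaps = []
    for barcode, positions in by_barcode.items():
        positions.sort()
        # Compare each read with its right-hand neighbour.
        for left, right in zip(positions, positions[1:]):
            if right - left > max_gap:
                gaps.append((barcode, left, right))
    return gaps


reads = [
    ("bc1", 1_000), ("bc1", 6_000), ("bc1", 200_000),
    ("bc2", 5_000), ("bc2", 9_000),
]
print(candidate_breakpoints(reads))
```

A real pipeline would also need the steps the abstract mentions but this sketch omits, such as haplotype phasing and filtering false positives with barcode sharing profiles.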



BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Sandra Louzada ◽  
Walid Algady ◽  
Eleanor Weyell ◽  
Luciana W. Zuccherato ◽  
Paulina Brajer ◽  
...  


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel L. Cameron ◽  
Jonathan Baber ◽  
Charles Shale ◽  
Jose Espejo Valle-Inclan ◽  
Nicolle Besselink ◽  
...  

Abstract GRIDSS2 is the first structural variant caller to explicitly report single breakends (breakpoints in which only one side can be unambiguously determined). By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32–100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants with 16% of somatic calls phasable using paired-end sequencing.



2017 ◽  
Author(s):  
Joseph G. Arthur ◽  
Xi Chen ◽  
Bo Zhou ◽  
Alexander E. Urban ◽  
Wing Hung Wong

Abstract Detecting structural variants (SVs) from sequencing data is key to genome analysis, but methods using standard whole-genome sequencing (WGS) data are typically incapable of resolving complex SVs with multiple co-located breakpoints. We introduce the ARC-SV method, which uses a probabilistic model to detect arbitrary local rearrangements from WGS data. Our method performs well on simple SVs while surpassing state-of-the-art methods in complex SV detection.



2018 ◽  
Vol 19 (S20) ◽  
Author(s):  
Zachary Stephens ◽  
Chen Wang ◽  
Ravishankar K. Iyer ◽  
Jean-Pierre Kocher


Author(s):  
B Meier ◽  
NV Volkova ◽  
Y Hong ◽  
S Bertolini ◽  
V González-Huici ◽  
...  

Abstract Genome integrity is particularly important in germ cells to faithfully preserve genetic information across generations. As yet little is known about the contribution of various DNA repair pathways to prevent mutagenesis. Using the C. elegans model we analyse mutational spectra that arise in wild-type and 61 DNA repair and DNA damage response mutants cultivated over multiple generations. Overall, 44% of lines show >2-fold increased mutagenesis with a broad spectrum of mutational outcomes including changes in single or multiple types of base substitutions induced by defects in base excision or nucleotide excision repair, or elevated levels of 50-400 bp deletions in translesion polymerase mutants rev-3(pol ζ) and polh-1(pol η). Mutational signatures associated with defective homologous recombination fall into two classes: 1) mutants lacking brc-1/BRCA1 or rad-51/RAD51 paralogs show elevated base substitutions, indels and structural variants, while 2) deficiency for MUS-81/MUS81 and SLX-1/SLX1 nucleases, and HIM-6/BLM, HELQ-1/HELQ and RTEL-1/RTEL1 helicases primarily cause structural variants. Genome-wide investigation of mutagenesis patterns identified elevated rates of tandem duplications often associated with inverted repeats in helq-1 mutants, and a unique pattern of ‘translocation’ events involving homeologous sequences in rip-1 paralog mutants. atm-1/ATM DNA damage checkpoint mutants harboured complex structural variants enriched in subtelomeric regions, and chromosome end-to-end fusions. Finally, while inactivation of the p53-like gene cep-1 did not affect mutagenesis, combined brc-1 cep-1 deficiency displayed increased, locally clustered mutagenesis. In summary, we provide a global view of how DNA repair pathways prevent germ cell mutagenesis.



2017 ◽  
Author(s):  
Mircea Cretu Stancu ◽  
Markus J. van Roosmalen ◽  
Ivo Renkens ◽  
Marleen Nieboer ◽  
Sjors Middelkamp ◽  
...  

Abstract Structural genomic variants form a common type of genetic alteration underlying human genetic disease and phenotypic variation. Despite major improvements in genome sequencing technology and data analysis, the detection of structural variants still poses challenges, particularly when variants are of high complexity. Emerging long-read single-molecule sequencing technologies provide new opportunities for detection of structural variants. Here, we demonstrate sequencing of the genomes of two patients with congenital abnormalities using the ONT MinION at 11x and 16x mean coverage, respectively. We developed a bioinformatic pipeline - NanoSV - to efficiently map genomic structural variants (SVs) from the long-read data. We demonstrate that the nanopore data are superior to corresponding short-read data with regard to detection of de novo rearrangements originating from complex chromothripsis events in the patients. Additionally, genome-wide surveillance of SVs revealed 3,253 (33%) novel variants that were missed in short-read data of the same sample, the majority of which are duplications < 200 bp in size. Long sequencing reads enabled efficient phasing of genetic variations, allowing the construction of genome-wide maps of phased SVs and SNVs. We employed read-based phasing to show that all de novo chromothripsis breakpoints occurred on paternal chromosomes and we resolved the long-range structure of the chromothripsis. This work demonstrates the value of long-read sequencing for screening whole genomes of patients for complex structural variants.



2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Alicia C. Bertolotti ◽  
Ryan M. Layer ◽  
Manu Kumar Gundappa ◽  
Michael D. Gallagher ◽  
Ege Pehlivanoglu ◽  
...  

Abstract Structural variants (SVs) are a major source of genetic and phenotypic variation, but remain challenging to accurately type and are hence poorly characterized in most species. We present an approach for reliable SV discovery in non-model species using whole genome sequencing and report 15,483 high-confidence SVs in 492 Atlantic salmon (Salmo salar L.) sampled from a broad phylogeographic distribution. These SVs recover population genetic structure with high resolution, include an active DNA transposon, widely affect functional features, and overlap more duplicated genes retained from an ancestral salmonid autotetraploidization event than expected. Changes in SV allele frequency between wild and farmed fish indicate polygenic selection on behavioural traits during domestication, targeting brain-expressed synaptic networks linked to neurological disorders in humans. This study offers novel insights into the role of SVs in genome evolution and the genetic architecture of domestication traits, along with resources supporting reliable SV discovery in non-model species.


