Vanishing refuge? Testing the forest refuge hypothesis in coastal East Africa using genome-wide sequence data for seven amphibians

2018 ◽  
Vol 27 (21) ◽  
pp. 4289-4308 ◽  
Author(s):  
Christopher D. Barratt ◽  
Beryl A. Bwong ◽  
Robert Jehle ◽  
H. Christoph Liedtke ◽  
Peter Nagel ◽  
...  
Nature ◽  
2021 ◽  
Vol 590 (7845) ◽  
pp. 290-299 ◽  
Author(s):  
Daniel Taliun ◽  
Daniel N. Harris ◽  
Michael D. Kessler ◽  
Jedidiah Carlson ◽  
...  

Abstract The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Taras K Oleksyk ◽  
Walter W Wolfsberger ◽  
Alexandra M Weber ◽  
Khrystyna Shchubelka ◽  
Olga T Oleksyk ◽  
...  

Abstract Background The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage. Results The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest to-date survey of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population. Conclusions Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information bringing forth a wealth of novel, endemic and medically related alleles.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Pierpaolo Maisano Delser ◽  
Eppie R. Jones ◽  
Anahit Hovhannisyan ◽  
Lara Cassidy ◽  
Ron Pinhasi ◽  
...  

Abstract Over the last few years, genome-wide data for a large number of ancient human samples have been collected. Whilst datasets of captured SNPs have been collated, high coverage shotgun genomes (which are relatively few but allow certain types of analyses not possible with ascertained captured SNPs) have to be reprocessed by individual groups from raw reads. This task is computationally intensive. Here, we release a dataset including 35 whole-genome sequenced samples, previously published and distributed worldwide, together with the genetic pipeline used to process them. The dataset contains 72,041,355 sites called across 19 ancient and 16 modern individuals and includes sequence data from four previously published ancient samples which we sequenced to higher coverage (10–18x). Such a resource will allow researchers to analyse their new samples with the same genetic pipeline and directly compare them to the reference dataset without re-processing published samples. Moreover, this dataset can be easily expanded to increase the sample distribution both across time and space.


Parasitology ◽  
2009 ◽  
Vol 136 (5) ◽  
pp. 469-485 ◽  
Author(s):  
A. S. TAFT ◽  
J. J. VERMEIRE ◽  
J. BERNIER ◽  
S. R. BIRKELAND ◽  
M. J. CIPRIANO ◽  
...  

SUMMARY Infection of the snail, Biomphalaria glabrata, by the free-swimming miracidial stage of the human blood fluke, Schistosoma mansoni, and its subsequent development to the parasitic sporocyst stage is critical to establishment of viable infections and continued human transmission. We performed a genome-wide expression analysis of the S. mansoni miracidia and developing sporocyst using Long Serial Analysis of Gene Expression (LongSAGE). Five cDNA libraries were constructed from miracidia and in vitro cultured 6- and 20-day-old sporocysts maintained in sporocyst medium (SM) or in SM conditioned by previous cultivation with cells of the B. glabrata embryonic (Bge) cell line. We generated 21 440 SAGE tags and mapped 13 381 to the S. mansoni gene predictions (v4.0e) either by estimating theoretical 3′ UTR lengths or using existing 3′ EST sequence data. Overall, 432 transcripts were found to be differentially expressed amongst all 5 libraries. In total, 172 tags were differentially expressed between miracidia and 6-day conditioned sporocysts and 152 were differentially expressed between miracidia and 6-day unconditioned sporocysts. In addition, 53 and 45 tags, respectively, were differentially expressed in 6-day and 20-day cultured sporocysts, due to the effects of exposure to Bge cell-conditioned medium.


2021 ◽  
Author(s):  
Thabo Michael Yates ◽  
Antoine Lain ◽  
Jamie Campbell ◽  
T. Ian Simpson ◽  
David R FitzPatrick

There are >2500 different genetically-determined developmental disorders (DD), which, as a group, show very high levels of both locus and allelic heterogeneity. This has led to the wide-spread use of evidence-based filtering of genome-wide sequence data as a diagnostic tool in DD. Determining whether the association of a filtered variant at a specific locus is a plausible explanation of the phenotype in the proband is crucial and commonly requires extensive manual literature review by both clinical scientists and clinicians. Access to a database of weighted clinical features extracted from rigorously curated literature would increase the efficiency of this process and facilitate the development of robust phenotypic similarity metrics. However, given the large and rapidly increasing volume of published information, conventional biocuration approaches are becoming impractical. Here, we present a scalable, automated method for extraction of categorical phenotypic descriptors from full-text literature. Papers identified through literature review were downloaded and parsed using the Cadmus custom retrieval package. Human Phenotype Ontology terms were extracted using MetaMap, with 76-83% precision and 72-81% recall. Mean terms per paper increased from 9 in title + abstract, to 69 using full text. We demonstrate that these literature-derived disease models plausibly reflect true disease expressivity more accurately than gold standard manually-curated models, through comparison with prospectively gathered data from the Deciphering Developmental Disorders study. AUC for ROC curves increased by 5-10% through use of literature-derived models. This work shows that scalable automated literature curation increases performance and adds weight to the need for this strategy to be integrated into informatic variant analysis pipelines.


2019 ◽  
Vol 45 (6) ◽  
pp. 1257-1266 ◽  
Author(s):  
Yang Du ◽  
Yun Yu ◽  
Yang Hu ◽  
Xiao-Wan Li ◽  
Ze-Xu Wei ◽  
...  

Abstract Genetic variants conferring risk for schizophrenia (SCZ) have been extensively studied, but the role of posttranscriptional mechanisms in SCZ is not well studied. Here we performed the first genome-wide microRNA (miRNA) expression profiling in serum-derived exosomes from 49 first-episode, drug-free SCZ patients and 46 controls and identified miRNAs and co-regulated modules that were perturbed in SCZ. Putative targets of these SCZ-affected miRNAs were enriched strongly for genes that have been implicated in protein glycosylation and were also related to neurotransmitter receptor and dendrite (spine) development. We validated several differentially expressed blood exosomal miRNAs in 100 SCZ patients as compared with 100 controls by quantitative reverse transcription-polymerase chain reaction. The potential regulatory relationships between several SCZ-affected miRNAs and their putative target genes were also validated. These include hsa-miR-206, which is the most upregulated miRNA in the blood exosomes of SCZ patients and was previously reported to regulate the expression of brain-derived neurotrophic factor, whose mRNA and protein levels we showed to be reduced in the blood of SCZ patients. In addition, we found 11 miRNAs in blood exosomes from the miRNA sequence data that can be used to classify samples from SCZ patients and control subjects with close to 90% accuracy in the training samples, and approximately 75% accuracy in the testing samples. Our findings support a role for exosomal miRNA dysregulation in SCZ pathophysiology and provide a rich data set and framework for future analyses of miRNAs in the disease, and our data also suggest that blood exosomal miRNAs are promising biomarkers for SCZ.


2020 ◽  
Author(s):  
Zalak Shah ◽  
Myo T Naung ◽  
Kara A Moser ◽  
Matthew Adams ◽  
Andrea G Buchwald ◽  
...  

Individuals acquire immunity to clinical malaria after repeated Plasmodium falciparum infections. This immunity to disease is thought to reflect the acquisition of a repertoire of responses to multiple alleles in diverse parasite antigens. In previous studies, we identified polymorphic sites within individual antigens that are associated with parasite immune evasion by examining antigen allele dynamics in individuals followed longitudinally. Here we expand this approach by analyzing genome-wide polymorphisms using whole genome sequence data from 140 parasite isolates representing malaria cases from a longitudinal study in Malawi and identify 25 genes that encode likely targets of naturally acquired immunity and that should be further characterized for their potential as vaccine candidates.


2015 ◽  
Author(s):  
Jane Hawkey ◽  
Mohammad Hamidian ◽  
Ryan R Wick ◽  
David J Edwards ◽  
Helen Billman-Jacobe ◽  
...  

Background Insertion sequences (IS) are small transposable elements, commonly found in bacterial genomes. Identifying the location of IS in bacterial genomes can be useful for a variety of purposes including epidemiological tracking and predicting antibiotic resistance. However IS are commonly present in multiple copies in a single genome, which complicates genome assembly and the identification of IS insertion sites. Here we present ISMapper, a mapping-based tool for identification of the site and orientation of IS insertions in bacterial genomes, direct from paired-end short read data. Results ISMapper was validated using three types of short read data: (i) simulated reads from a variety of species, (ii) Illumina reads from 5 isolates for which finished genome sequences were available for comparison, and (iii) Illumina reads from 7 Acinetobacter baumannii isolates for which predicted IS locations were tested using PCR. A total of 20 genomes, including 13 species and 32 distinct IS, were used for validation. ISMapper correctly identified 96% of known IS insertions in the analysis of simulated reads, and 98% in real Illumina reads. Subsampling of real Illumina reads to lower depths indicated ISMapper was reliable for average genome-wide read depths >20x. All ISAba1 insertions identified by ISMapper in the A. baumannii genomes were confirmed by PCR. In each A. baumannii genome, ISMapper successfully identified an IS insertion upstream of the ampC beta-lactamase that could explain phenotypic resistance to third-generation cephalosporins. The utility of ISMapper was further demonstrated by profiling genome-wide IS6110 insertions in 138 publicly available Mycobacterium tuberculosis genomes, revealing lineage-specific insertions and multiple insertion hotspots. Conclusions ISMapper provides a rapid and robust method for identifying IS insertion sites direct from short read data, with a high degree of accuracy demonstrated across a wide range of bacteria.


2021 ◽  
Author(s):  
Adam C. Naj ◽  
Ganna Leonenko ◽  
Xueqiu Jian ◽  
Benjamin Grenier-Boley ◽  
Maria Carolina Dalmasso ◽  
...  

Risk for late-onset Alzheimer's disease (LOAD) is driven by multiple loci primarily identified by genome-wide association studies, many of which are common variants with minor allele frequencies (MAF)>0.01. To identify additional common and rare LOAD risk variants, we performed a GWAS on 25,170 LOAD subjects and 41,052 cognitively normal controls in 44 datasets from the International Genomics of Alzheimer's Project (IGAP). Existing genotype data were imputed using the dense, high-resolution Haplotype Reference Consortium (HRC) r1.1 reference panel. Stage 1 associations of P<10⁻⁵ were meta-analyzed with the European Alzheimer's Disease Biobank (EADB) (n=20,301 cases; 21,839 controls) (stage 2 combined IGAP and EADB). An expanded meta-analysis was performed using a GWAS of parental AD/dementia history in the UK Biobank (UKBB) (n=35,214 cases; 180,791 controls) (stage 3 combined IGAP, EADB, and UKBB). Common variant (MAF≥0.01) associations were identified for 29 loci in stage 2, including novel genome-wide significant associations at TSPAN14 (P=2.33×10⁻¹²), SHARPIN (P=1.56×10⁻⁹), and ATF5/SIGLEC11 (P=1.03×10⁻⁸), and newly significant associations without using AD proxy cases in MTSS1L/IL34 (P=1.80×10⁻⁸), APH1B (P=2.10×10⁻¹³), and CLNK (P=2.24×10⁻¹⁰). Rare variant (MAF<0.01) associations with genome-wide significance in stage 2 included multiple variants in APOE and TREM2, and a novel association of a rare variant (rs143080277; MAF=0.0054; P=2.69×10⁻⁹) in NCK2, further strengthened with the inclusion of UKBB data in stage 3 (P=7.17×10⁻¹³). Single-nucleus sequence data shows that NCK2 is highly expressed in amyloid-responsive microglial cells, suggesting a role in LOAD pathology.

