Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans

2017 ◽  
Vol 8 (1) ◽  
Author(s):  
Ananyo Choudhury ◽  
Michèle Ramsay ◽  
Scott Hazelhurst ◽  
Shaun Aron ◽  
Soraya Bardien ◽  
...  
2021 ◽  
Author(s):  
◽  
Mariah Taylor ◽  

Two RNA virus families that pose a threat to human and animal health are Hantaviridae and Coronaviridae. These RNA viruses, which originate in wildlife, continue to cause disease and will keep doing so; it is therefore critical that research define the mechanisms by which these viruses spill over and adapt to new hosts to become endemic. One gap in our ability to define these mechanisms is the lack of whole-genome sequences for many of these viruses. To address this gap, I developed a versatile amplicon-based whole-genome sequencing (WGS) approach to identify viral genomes of hantaviruses and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) within reservoir and spillover hosts. I used the amplicon-based WGS approach to define the genetic plasticity of viral RNA within pathogenic and nonpathogenic hantavirus species. The standing genetic variation of Andes orthohantavirus and Prospect Hill orthohantavirus was mapped, and amino acid changes occurring outside of functional domains were identified within the nucleocapsid and glycoprotein. I observed several amino acid changes in functional domains of the RNA-dependent RNA polymerase, as well as single nucleotide polymorphisms (SNPs) within the 3’ non-coding region (NCR) of the S-segment. To identify whether virus adaptation would occur within the S- and L-segments, we attempted to adapt hantaviruses in vitro in a spillover host model through passaging experiments. In early passages we identified few mutations in the M-segment, with the majority being identified in the S-segment 3’ NCR and the L-segment. This work suggests that hantavirus adaptation occurs in the S- and L-segments, although the effect of these mutations on pathology is yet to be determined. While sequencing laboratory isolates is easily accomplished, sequencing low concentrations of virus within the reservoir is a formidable task.
I further translated our amplicon-based WGS approach into a pan-oligonucleotide amplicon-based WGS approach to sequence hantavirus vRNA and mRNA from reservoir and spillover hosts in Ukraine. This approach successfully identified a novel Puumala orthohantavirus (PUUV) strain in Ukraine, and using Bayesian phylogenetics we found this strain to be associated with the PUUV Latvian lineage. Early in the SARS-CoV-2 pandemic, I applied the knowledge gained in the hantavirus WGS efforts to sequencing SARS-CoV-2 from nasopharyngeal swabs collected in April 2020. The genetic diversity of 45 SARS-CoV-2 isolates was evaluated with the methods I developed. We identified D614G, a notable mutation known for increasing transmission, in over 90% of our isolates. Two major lineages, A and B, distinguish SARS-CoV-2 variants worldwide. While most of our isolates fell within lineage B, we also identified one isolate within lineage A. We performed in vitro work which confirmed that A lineage isolates replicate poorly in the trachea compared to the nasal cavity. Five of these isolates presented a unique array of mutations and were assessed in the keratin 18 human angiotensin-converting enzyme 2 (K18-hACE2) mouse model for their immunological and pathogenic effects. We identified a difference in pathogenesis between the A and B lineages, with emphysema being common among A lineage isolates. Additionally, we discovered a small cohort of SNPs that likely defined the late induction of eosinophils during infection. In summary, this work will further define the dynamics of genetic variation and plasticity within virus populations that cause disease outbreaks and will allow a deeper understanding of the virus-host relationship.


2011 ◽  
Vol 43 (8) ◽  
pp. 741-743 ◽  
Author(s):  
Srikanth Gottipati ◽  
Leonardo Arbiza ◽  
Adam Siepel ◽  
Andrew G Clark ◽  
Alon Keinan

2017 ◽  
Author(s):  
Kellie A. Schaefer ◽  
Benjamin W Darbro ◽  
Diana F. Colgan ◽  
Stephen H. Tsang ◽  
Alexander G. Bassuk ◽  
...  

Our previous publication suggested that CRISPR-Cas9 editing at the zygotic stage might unexpectedly introduce a multitude of subtle but unintended mutations, an interpretation that, not surprisingly, raised numerous questions. The key issue is that, since parental lines were not available, the reported variants might have been inherited. To expand upon the limited available whole-genome data on whether CRISPR-edited mice show more genetic variation, whole-genome sequencing was performed on two other mouse lines that had undergone a CRISPR-editing procedure. Again, parents were not available for either the Capn5 or the Fblim1 CRISPR-edited mouse line, so strain controls were examined. We also include verification of variants detected in the initial mouse line. Taken together, these whole-genome-sequencing-level results support the idea that, in specific cases, CRISPR-Cas9 editing can precisely edit the genome at the organismal level and may not introduce numerous unintended off-target mutations.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Anne-Katrin Emde ◽  
Amanda Phipps-Green ◽  
Murray Cadzow ◽  
C. Scott Gallagher ◽  
Tanya J. Major ◽  
...  

Background: Historically, geneticists have relied on genotyping arrays and imputation to study human genetic variation. However, an underrepresentation of diverse populations has resulted in arrays that poorly capture global genetic variation, and a lack of reference panels. This has contributed to deepening global health disparities. Whole genome sequencing (WGS) better captures genetic variation but remains prohibitively expensive. Thus, we explored WGS at “mid-pass” 1-7x coverage.
Results: Here, we developed and benchmarked methods for mid-pass sequencing. When applied to a population without an existing genomic reference panel, 4x mid-pass performed consistently well across ethnicities, with high recall (98%) and precision (97.5%).
Conclusion: Compared to array data imputed into 1000 Genomes, mid-pass performed better across all metrics and identified novel population-specific variants with potential disease relevance. We hope our work will reduce financial barriers for geneticists from underrepresented populations to characterize their genomes prior to biomedical genetic applications.
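The recall and precision figures quoted in the abstract above can be read against the usual definitions: recall is the fraction of true variants that were called, and precision is the fraction of called variants that are true. The sketch below is illustrative only; the function name `precision_recall` and the toy variant tuples are hypothetical and are not taken from the paper, whose benchmarking pipeline is not described here.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for a call set against a truth set.

    Variants are compared as exact (chrom, pos, ref, alt) tuples, which is
    a simplifying assumption; real benchmarks normalise representations first.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data (hypothetical): three of four calls are in the truth set,
# and three of four truth variants were called.
truth = {("1", 100, "A", "G"), ("1", 200, "C", "T"),
         ("2", 50, "G", "A"), ("3", 10, "T", "C")}
called = {("1", 100, "A", "G"), ("1", 200, "C", "T"),
          ("2", 50, "G", "A"), ("4", 5, "A", "T")}

precision, recall = precision_recall(called, truth)
print(precision, recall)
```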


Circulation ◽  
2012 ◽  
Vol 125 (suppl_10) ◽  
Author(s):  
John Chambers ◽  
Abtehale Al-Hussaini ◽  
Tsung Tan ◽  
Joban Sehmi ◽  
Mark McCarthy ◽  
...  

Background: The genetic architecture and variation of Indian Asians, who represent one quarter of the world's population, have not been described. This represents an important obstacle to the identification of the genetic factors contributing to diseases encountered in Indian Asians.
Aim: To identify and describe the patterns of genetic variation in Indian Asians.
Methods: We carried out high-depth whole genome sequencing of 8 Indian Asian men, using paired-end and mate-pair libraries and Illumina GAIIx instruments. We used Stampy, with BWA as a pre-mapper, to align reads to Genome Reference Consortium build 37 of the human genome (GRCh37). We used GATK and SAMtools to call SNPs and indels, accepting genetic variants called by both algorithms as confirmed.
Results: Mean coverage was 28.4x (range 13.9 to 32.5x); 99.8% of the mappable genome was covered by at least one read in each sample. We found 6,602,840 autosomal SNPs (mean 3,318,386 per person), of which 436,823 (6.6%) are novel (not in dbSNP132 or 1000G June 2011). The majority of novel SNPs were singletons (88% vs 20% for known SNPs). There were 50,585 novel SNPs present at least twice (i.e. MAF > 10%), and 2,174 novel SNPs predicted to affect protein coding. Amongst the novel cSNPs that are identified as pathogenic by SIFT or PolyPhen2, 145 are in genes linked by OMIM to human disease, including obesity (FTO, UCP1), diabetes mellitus (CDKAL1, GCGR, HNF1B), lipid metabolism (APOB), renal disease (NPHP4, PKD1), hypertension (NOS2), iron and B vitamin metabolism (CUBN, TCN2, TF), and susceptibility to malaria and leprosy (CR1, FCGR2A, NOS2, TLR1). There were 65,613 novel autosomal indels, of which 35,097 are present at least twice, and 2,301 novel deletions >100 bp. We found that amongst the novel SNPs and indels discovered, >50% are not in high LD (r² ≥ 0.8) with tagSNPs on available high-density microarrays.
Conclusions: Our results reveal 502,436 new genetic variants amongst Indian Asians, including coding SNPs and indels in genes involved in atherosclerosis, carbohydrate and lipid metabolism, immunity and inflammation. The majority of novel variants are in low LD with standard commercial microarrays, indicating that these genome-wide arrays do not capture Indian Asian specific genetic variation. Our findings will inform the design of future studies to identify the genetic factors contributing to cardiovascular disease and other disorders that are more common amongst Indian Asians.
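Two steps in the abstract above lend themselves to a short worked sketch: accepting only variants called by both callers, and the arithmetic behind the frequency threshold and the variant total. The code below is a minimal illustration under stated assumptions, not the authors' pipeline: `Variant`, `consensus_calls` and `minor_allele_frequency` are hypothetical names, the toy call lists are invented, and "present at least twice" is assumed to mean two allele copies among the 16 autosomal chromosomes of 8 diploid individuals.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def consensus_calls(calls_a, calls_b):
    """Keep only variants reported by both callers (set intersection)."""
    return set(calls_a) & set(calls_b)


def minor_allele_frequency(alt_copies, n_individuals):
    """Minor allele frequency at an autosomal site in diploid individuals."""
    freq = alt_copies / (2 * n_individuals)
    return min(freq, 1 - freq)


# Toy call sets standing in for the output of two callers (invented data).
gatk_calls = [Variant("1", 1000, "A", "G"), Variant("1", 2000, "C", "T")]
samtools_calls = [Variant("1", 1000, "A", "G"), Variant("2", 500, "G", "A")]
confirmed = consensus_calls(gatk_calls, samtools_calls)

# Two copies among 8 individuals: 2 / 16 = 0.125, above the 10% threshold.
maf_two_copies = minor_allele_frequency(2, 8)

# Novel SNPs plus novel indels reported in the abstract.
novel_total = 436_823 + 65_613

print(confirmed, maf_two_copies, novel_total)
```

Under these assumptions, the sum of novel SNPs and novel indels matches the 502,436 new variants stated in the conclusions.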


2019 ◽  
Vol 12 (10) ◽  
pp. 1971-1987 ◽  
Author(s):  
Gemma V. Clucas ◽  
R. Nicolas Lou ◽  
Nina O. Therkildsen ◽  
Adrienne I. Kovach
