Frequency Analysis Techniques for Identification of Viral Genetic Data

mBio ◽  
2010 ◽  
Vol 1 (3) ◽  
Author(s):  
Vladimir Trifonov ◽  
Raul Rabadan

ABSTRACT Environmental metagenomic samples and samples obtained as an attempt to identify a pathogen associated with the emergence of a novel infectious disease are important sources of novel microorganisms. The low costs and high throughput of sequencing technologies are expected to allow for the genetic material in those samples to be sequenced and the genomes of the novel microorganisms to be identified by alignment to those in a database of known genomes. Yet, for various biological and technical reasons, such alignment might not always be possible. We investigate a frequency analysis technique which on one hand allows for the identification of genetic material without relying on alignment and on the other hand makes possible the discovery of nonoverlapping contigs from the same organism. The technique is based on obtaining signatures of the genetic data and defining a distance/similarity measure between signatures. More precisely, the signatures of the genetic data are the frequencies of k-mers occurring in them, with k being a natural number. We considered an entropy-based distance between signatures, similar to the Kullback-Leibler distance in information theory, and investigated its ability to categorize negative-sense single-stranded RNA (ssRNA) viral genetic data. Our conclusion is that in this viral context, the technique provides a viable way of discovering genetic relationships without relying on alignment. We envision that our approach will be applicable to other microbial genetic contexts, e.g., other types of viruses, and will be an important tool in the discovery of novel microorganisms. IMPORTANCE Multiple factors contribute to the emergence of novel infectious diseases. Implementation of effective measures against such diseases relies on the rapid identification of novel pathogens. Another important source of novel microorganisms is environmental metagenomic samples. 
The low costs and high throughput of sequencing technologies provide a method for the identification of novel microorganisms by sequence alignment. There are several obstacles to this method, as follows: our knowledge of biology is biased by an anthropomorphic view, microbial genomic material could be a minuscule fraction of the sample, the sequencing and enrichment technologies can be a source of errors and biases, and finally, microbes have high diversity and high evolutionary rates. As a result, novel microorganisms could have very low genetic similarity to already known genomes, and the identification by alignment could be computationally prohibitive. We investigate a frequency analysis technique which allows for the identification of novel genetic material without relying on alignment.
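The signature-and-distance idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract only says the distance is entropy-based and "similar to the Kullback-Leibler distance", so the symmetrized KL divergence, the pseudocount smoothing, and the function names `kmer_frequencies`, `kl_divergence`, and `symmetric_kl` are all assumptions made here.

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_frequencies(seq, k, pseudocount=1.0):
    """Return smoothed relative frequencies of all 4**k DNA k-mers in seq.

    The pseudocount (an assumption of this sketch) keeps every frequency
    positive so that the log ratios below are always defined.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are simply ignored.
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[m] * log2(p[m] / q[m]) for m in p)


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as a stand-in distance."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Toy sequences, invented for illustration only.
a = kmer_frequencies("ACGTACGTACGTAAACCC", k=2)
b = kmer_frequencies("GGGGCCCCGGGGCCCCGG", k=2)
print(symmetric_kl(a, a))  # 0.0: a signature is at distance zero from itself
print(symmetric_kl(a, b))  # positive for differing compositions
```

Because the signatures depend only on k-mer composition, two nonoverlapping contigs can be compared directly, which is the property the abstract highlights.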

2021 ◽  
Author(s):  
Tanner Roy Wiegand ◽  
Aidan McVey ◽  
Anna Nemudraia ◽  
Artem Nemudryi ◽  
Blake Wiedenheft

In late December of 2019, high throughput sequencing technologies enabled rapid identification of SARS-CoV-2 as the etiological agent of COVID-19, and global sequencing efforts are now a critical tool for monitoring the ongoing spread and evolution of this virus. Here, we analyze a subset (n=87,032) of all publicly available SARS-CoV-2 genomes (n=~5.6 million) that were randomly selected, but equally distributed over the course of the pandemic. We plot the appearance of new variants of concern (VOCs) over time and show that the mutation rates in Omicron viruses are significantly greater than those in previously identified SARS-CoV-2 variants. Mutations in Omicron are primarily restricted to the spike protein, while 25 other viral proteins—including those involved in SARS-CoV-2 replication—are highly conserved. Collectively, this suggests that the genetic distinction of Omicron primarily arose from selective pressures on the spike, and that the fidelity of replication of this variant has not been altered.


Diagnostics ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 791
Author(s):  
Alba Folgueiras-González ◽  
Robin van den Braak ◽  
Martin Deijs ◽  
Lia van der Hoek ◽  
Ad de Groof

In recent years, refined molecular methods coupled with powerful high throughput sequencing technologies have increased the potential of virus discovery in clinical samples. However, host genetic material remains a complicating factor that interferes with discovery of novel viruses in solid tissue samples as the relative abundance of the virus material is low. Physical enrichment processing methods, although usually complicated, labor-intensive, and costly, have proven to be successful for improving sensitivity of virus detection in complex samples. In order to further increase detectability, we studied the application of fast and simple high-throughput virus enrichment methods on tissue homogenates. Probe sonication in high EDTA concentrations, organic extraction with Vertrel™ XF, or a combination of both, were applied prior to chromatography-like enrichment using Capto™ Core 700 resin, after which effects on virus detection sensitivity by the VIDISCA method were determined. Sonication in the presence of high concentrations of EDTA showed the best performance with an increased proportion of viral reads, up to 9.4 times, yet minimal effect on the host background signal. When this sonication procedure in high EDTA concentrations was followed by organic extraction with Vertrel™ XF and two rounds of core bead chromatography enrichment, an increase up to 10.5 times in the proportion of viral reads in the processed samples was achieved, with reduction of host background sequencing. We present a simple and semi-high-throughput method that can be used to enrich homogenized tissue samples for viral reads.


2006 ◽  
Vol 11 (3) ◽  
pp. 236-246 ◽  
Author(s):  
Laurence H. Lamarcq ◽  
Bradley J. Scherer ◽  
Michael L. Phelan ◽  
Nikolai N. Kalnine ◽  
Yen H. Nguyen ◽  
...  

A method for high-throughput cloning and analysis of short hairpin RNAs (shRNAs) is described. Using this approach, 464 shRNAs against 116 different genes were screened for knockdown efficacy, enabling rapid identification of effective shRNAs against 74 genes. Statistical analysis of the effects of various criteria on the activity of the shRNAs confirmed that some of the rules thought to govern small interfering RNA (siRNA) activity also apply to shRNAs. These include moderate GC content, absence of internal hairpins, and asymmetric thermal stability. However, the authors did not find strong support for position-specific rules. In addition, analysis of the data suggests that not all genes are equally susceptible to RNA interference (RNAi).


2011 ◽  
Vol 22 (2) ◽  
pp. 67-74 ◽  
Author(s):  
Malgorzata Sudol ◽  
Jennifer L Fritz ◽  
Melissa Tran ◽  
Gavin P Robertson ◽  
Julie B Ealy ◽  
...  

Background: In addition to activities needed to catalyse integration, retroviral integrases exhibit non-specific endonuclease activity that is enhanced by certain small compounds, suggesting that integrase could be stimulated to damage viral DNA before integration occurs. Methods: A non-radioactive, plate-based, solution phase, fluorescence assay was used to screen a library of 50,080 drug-like chemicals for stimulation of non-specific DNA nicking by HIV-1 integrase. Results: A semi-automated workflow was established and primary hits were readily identified from a graphic output. Overall, 0.6% of the chemicals caused a large increase in fluorescence (the primary hit rate) without also having visible colour that could have artifactually caused this result. None of the potential stimulators from this moderate-size library, however, passed a secondary test that included an inactive integrase mutant that assessed whether the increased fluorescence depended on the endonuclease activity of integrase. Conclusions: This first attempt at identifying integrase stimulator compounds establishes the necessary logistics and workflow required. The results from this study should encourage larger scale high-throughput screening to advance the novel antiviral strategy of stimulating integrase to damage retroviral DNA.


Author(s):  
Yi Zhang ◽  
Tao Wang ◽  
Yan Wang ◽  
Kun Xia ◽  
Jinchen Li ◽  
...  

Abstract: Neurodevelopmental disorders (NDDs) are a group of diseases characterized by high heterogeneity and frequently co-occurring symptoms. The mutational spectrum in patients with NDDs is largely incomplete. Here, we sequenced 547 genes from 1102 patients with NDDs and validated 1271 potential functional variants, including 108 de novo variants (DNVs) in 78 autosomal genes and seven inherited hemizygous variants in six X chromosomal genes. Notably, 36 of these 78 genes are the first to be reported in Chinese patients with NDDs. By integrating our genetic data with public data, we prioritized 212 NDD candidate genes with FDR < 0.1, including 17 novel genes. The novel candidate genes interacted or were co-expressed with known candidate genes, forming a functional network involved in known pathways. We highlighted MSL2, which carried two de novo protein-truncating variants (p.L192Vfs*3 and p.S486Ifs*11) and was frequently connected with known candidate genes. This study provides the mutational spectrum of NDDs in China and prioritizes 212 NDD candidate genes for further functional validation and genetic counseling.


Author(s):  
Stella C. Yuan ◽  
Eric Malekos ◽  
Melissa T. R. Hawkins

Abstract: The use of museum specimens held in natural history repositories for population and conservation genetic research is increasing in tandem with the use of massively parallel sequencing technologies. Short Tandem Repeats (STRs), or microsatellite loci, are commonly used genetic markers in wildlife and population genetic studies. However, they traditionally suffered from a host of issues including length homoplasy, high costs, low throughput, and difficulties in reproducibility across laboratories. Massively parallel sequencing technologies can address these problems, but the incorporation of museum specimen-derived DNA suffers from significant fragmentation and exogenous DNA contamination. Combatting these issues requires extra measures of stringency in the lab and during data analysis, yet there have not been any high-throughput sequencing studies evaluating microsatellite allelic dropout from museum specimen-extracted DNA. In this study, we evaluate genotyping errors derived from mammalian museum skin DNA extracts for previously characterized microsatellites across PCR replicates utilizing high-throughput sequencing. We found it useful to classify samples based on DNA concentration, which determined the rate by which genotypes were accurately recovered. Longer microsatellites performed worse in all museum specimens. Allelic dropout rates across loci were dependent on sample quantity, with high concentration museum specimens performing as well and recovering quality metrics nearly as high as the frozen tissue sample. Based on our results, we provide a set of best practices for quality assurance and incorporation of reliable genotypes from museum specimens.


Viruses ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 64 ◽  
Author(s):  
Chang Liu ◽  
Wei Cai ◽  
Xin Yin ◽  
Zimin Tang ◽  
Guiping Wen ◽  
...  

Hepatitis E virus (HEV) is a common cause of acute hepatitis worldwide. Current methods for evaluating the neutralizing activity of HEV-specific antibodies include immunofluorescence focus assays (IFAs) and real-time PCR, which are insensitive and operationally complicated. Here, we developed a high-throughput neutralization assay by measuring secreted pORF2 levels using an HEV antigen enzyme-linked immunosorbent assay (ELISA) kit based on the highly replicating HEV genotype (gt) 3 strain Kernow. We evaluated the neutralizing activity of HEV-specific antibodies and the sera of vaccinated individuals (n = 15) by traditional IFA and the novel assay simultaneously. A linear regression analysis shows that there is a high degree of correlation between the two assays. Furthermore, the anti-HEV IgG levels exhibited moderate correlation with the neutralizing titers of the sera of vaccinated individuals, indicating that immunization with gt 1 can protect against gt 3 Kernow infection. We then determined the specificity of the novel assay and the potential threshold of neutralizing capacity using anti-HEV IgG positive sera (n = 27) and anti-HEV IgG negative sera (n = 23). The neutralizing capacity of anti-HEV IgG positive sera was significantly stronger than that of anti-HEV IgG negative sera. In addition, ROC curve analysis shows that the potential threshold of neutralizing capacity of sera was 8.07, and the sensitivity and specificity of the novel assay were 88.6% and 100%, respectively. Our results suggest that the neutralization assay using the antigen ELISA kit could be a useful tool for HEV clinical research.
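How a sensitivity and specificity pair follows from a chosen threshold can be sketched as below. The titer values are hypothetical and invented for illustration; only the threshold 8.07 comes from the abstract. Treating values at or above the threshold as "neutralizing" is also an assumption, as is the function name `sensitivity_specificity`.

```python
def sensitivity_specificity(positive_values, negative_values, threshold):
    """Classify a sample as positive when its value is >= threshold.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    true_pos = sum(v >= threshold for v in positive_values)
    true_neg = sum(v < threshold for v in negative_values)
    return true_pos / len(positive_values), true_neg / len(negative_values)


# Hypothetical neutralizing-capacity values, NOT data from the study.
igg_positive = [12.0, 9.5, 8.1, 30.2, 6.0]
igg_negative = [1.0, 2.5, 7.9, 0.4]

sens, spec = sensitivity_specificity(igg_positive, igg_negative, threshold=8.07)
print(sens, spec)  # 0.8 1.0
```

An ROC analysis repeats this calculation over many candidate thresholds and picks one that balances the two rates.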


Author(s):  
Thien Minh Nguyen ◽  
Tien Thi My Pham

The agronomic values of this population have been evaluated in field experiments based on the phenotypic performance of agronomic traits, but the genetic variability of this population needs to be evaluated via techniques based on genetic material (DNA). In this study, the genetic variability in the investigated population of 71 hybrids and their parents was evaluated by the RAPD technique, using eight selected arbitrary primers. Genetic parameters and a dendrogram expressing the genetic relationships among the investigated population were analyzed with the GenALEx 6.1, Popgene 1.31 and NTSYSpc 2.1 software packages. The eight primers were used to generate amplification products for each individual in the investigated population. From 74 genotypes, a total of 109 fragments were generated, of which 89 were polymorphic bands (81.65%), an average of 11 polymorphic bands per primer. The genetic similarity coefficient among the investigated population, based on the DICE coefficient, ranged from 0.560 (LH05/0822 and PB260) to 0.991 (LH05/0781 and LH05/0841) with an average of 0.796, meaning that the genetic distance ranged from 0.009 to 0.440 with an average of 0.231. The Shannon index and mean heterozygosity values were 0.328 and 0.176, respectively. This indicated that the progenies of the two investigated crosses possessed a relatively high range of genetic variability. The analysis of molecular variance (AMOVA) showed that genetic variation within populations represented 62%, while genetic variation between the two crosses contributed 38% of the total genetic variability. The dendrogram based on DICE genetic similarity using the UPGMA method showed that the hybrids divide into two major genetic groups (0.75), but the crosses were scattered independently of the hybrid.
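The Dice similarity and the corresponding distance (taken here as 1 minus similarity, which matches the reported extremes 0.991 to 0.009 and 0.560 to 0.440) can be sketched for RAPD band profiles as follows. The band profiles and the function names are hypothetical; this is an illustration of the coefficient, not the NTSYSpc computation itself.

```python
def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2*shared / (bands_a + bands_b) for 0/1 band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    total = sum(profile_a) + sum(profile_b)
    if total == 0:
        raise ValueError("Dice coefficient is undefined when no bands are present")
    return 2 * shared / total


def dice_distance(profile_a, profile_b):
    """Genetic distance taken as 1 - Dice similarity."""
    return 1.0 - dice_similarity(profile_a, profile_b)


# Hypothetical presence (1) / absence (0) scores for six bands.
genotype_1 = [1, 1, 0, 1, 0, 1]
genotype_2 = [1, 0, 0, 1, 1, 1]

print(dice_similarity(genotype_1, genotype_2))  # 0.75
print(dice_distance(genotype_1, genotype_2))    # 0.25

# Share of polymorphic bands reported in the abstract: 89 of 109 fragments.
print(round(100 * 89 / 109, 2))  # 81.65
```

A matrix of such pairwise similarities is the usual input to UPGMA clustering when building a dendrogram.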


2020 ◽  
Vol 13 (1) ◽  
pp. 120
Author(s):  
Arisni Kholifatu ◽  
Tengsoe Tjahjono

ABSTRACT The purpose of this study is to describe the influence of the highest throne and the resistance of the subalterns in the novel Arok Dedes by Pramoedya Ananta Toer, using the postcolonial theory of Gayatri Spivak. This is a descriptive qualitative study. A qualitative approach is used because the data source is the novel Arok Dedes, which tells of a coup in Java. The research data are the words, sentences, and paragraphs of the novel. Data were collected using the documentation (literature) method and analyzed using descriptive analysis techniques. The results of the study describe the influence of the highest throne and the resistance of the subalterns in the novel Arok Dedes by Pramoedya Ananta Toer. Keywords: subaltern, postcolonial, influence of the throne, resistance

