Evidence for Transcriptome-wide RNA Editing Among Sus scrofa PRE-1 SINE Elements

Mapping Intimacies ◽

10.1101/096321 ◽

2017 ◽

Author(s):

Scott A. Funkhouser ◽

Juan P. Steibel ◽

Ronald O. Bates ◽

Nancy E. Raney ◽

Darius Schenk ◽

...

Keyword(s):

Genetic Variation ◽

Transcriptional Regulation ◽

Rna Editing ◽

High Throughput ◽

Sus Scrofa ◽

Sequence Data ◽

Rna Sequences ◽

Editing Event ◽

Coding Regions ◽

Dna And Rna

Abstract. Background: RNA editing by ADAR (adenosine deaminase acting on RNA) proteins is a form of transcriptional regulation that is widespread among humans and other primates. Based on high-throughput scans used to identify putative RNA editing sites, ADAR appears to catalyze a substantial number of adenosine-to-inosine transitions within repetitive regions of the primate transcriptome, thereby dramatically enhancing genetic variation beyond what is encoded in the genome. Results: Here, we demonstrate the editing potential of the pig transcriptome by utilizing DNA and RNA sequence data from the same pig. We identified a total of 8550 mismatches between DNA and RNA sequences across three tissues, with 75% of these exhibiting an A-to-G (DNA to RNA) discrepancy, indicative of a canonical ADAR-catalyzed RNA editing event. When we consider only mismatches within repetitive regions of the genome, the A-to-G percentage increases to 94%, with the majority of these located within the swine-specific SINE retrotransposon PRE-1. We also observe evidence of A-to-G editing within coding regions that were previously verified in primates. Conclusions: Thus, our high-throughput evidence suggests that pervasive RNA editing by ADAR can exist outside of the primate lineage to dramatically enhance genetic variation in pigs.
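
The A-to-G percentage reported in this abstract is a simple tally over DNA-to-RNA mismatch types. A minimal sketch of such a tally follows; the function names and the toy input are hypothetical and are not taken from the authors' pipeline, and the sketch ignores strand (on the reverse strand an A-to-G edit would appear as T-to-C).

```python
from collections import Counter

def mismatch_spectrum(sites):
    """Count mismatch types from (dna_base, rna_base) pairs.

    Pairs where the RNA base equals the DNA base are not mismatches
    and are skipped.
    """
    counts = Counter()
    for dna, rna in sites:
        dna, rna = dna.upper(), rna.upper()
        if dna != rna:
            counts[f"{dna}-to-{rna}"] += 1
    return counts

def fraction(counts, kind="A-to-G"):
    """Share of all mismatches that are of the given kind."""
    total = sum(counts.values())
    return counts[kind] / total if total else 0.0

# Toy input (hypothetical, not data from the study).
sites = [("A", "G"), ("A", "G"), ("A", "G"), ("C", "T"), ("A", "A")]
spectrum = mismatch_spectrum(sites)
a_to_g = fraction(spectrum)
print(spectrum, a_to_g)
```

Restricting `sites` to positions inside annotated repeats before calling `mismatch_spectrum` would give the repeat-only percentage in the same way.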

Genome-Wide Investigation and Functional Analysis of Sus scrofa RNA Editing Sites across Eleven Tissues

Genes ◽

10.3390/genes10050327 ◽

2019 ◽

Vol 10 (5) ◽

pp. 327 ◽

Cited By ~ 4

Author(s):

Zishuai Wang ◽

Xikang Feng ◽

Zhonglin Tang ◽

Shuai Cheng Li

Keyword(s):

Rna Editing ◽

Sus Scrofa ◽

Sequencing Data ◽

Coding Regions ◽

Genome Wide ◽

Model Animal ◽

High Enrichment ◽

The Impact ◽

Sequence Characteristics ◽

Editing Level

Recently, the prevalence and importance of RNA editing have been illuminated in mammals. However, studies on RNA editing in pigs, a widely used biomedical model animal, are rare. Here we collected RNA sequencing data across 11 tissues and identified more than 490,000 RNA editing sites. We annotated their biological features, detected flanking sequence characteristics of A-to-I editing sites and the impact of A-to-I editing on miRNA–mRNA interactions, and identified RNA editing quantitative trait loci (edQTL). Sus scrofa RNA editing sites showed high enrichment in repetitive regions, with a median editing level of 15.38%. As expected, 96.3% of the editing sites were located in non-coding regions, including introns, 3′ UTRs, intergenic, and gene-proximal regions. There were 2233 editing sites located in the coding regions, and 980 of them caused missense mutations. Our results indicated that, for an A-to-I editing site, the four adjacent nucleotides, two before it and two after it, have a high impact on the occurrence of editing. A commonly observed editing motif is CCAGG. We found that 4552 A-to-I RNA editing sites could disturb the original binding efficiencies of miRNAs and 4176 A-to-I RNA editing sites created new potential miRNA target sites. In addition, we performed edQTL analysis and found 1134 edQTLs that significantly affected the editing levels of 137 RNA editing sites. Finally, we constructed PRESDB, the first pig RNA editing sites database. The site provides necessary functions associated with Sus scrofa RNA editing study.
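
The flanking-sequence observation (two nucleotides before and two after the edited adenosine) can be illustrated with a small context tally. This is a sketch under stated assumptions: the helper names, the 0-based coordinates, and the toy sequence are hypothetical, and it is not the authors' analysis code.

```python
from collections import Counter

def flank_context(seq, pos, k=2):
    """Return the window of k bases either side of 0-based pos.

    Returns None when the window would run off either end of seq.
    """
    if pos - k < 0 or pos + k >= len(seq):
        return None
    return seq[pos - k: pos + k + 1].upper()

def motif_counts(seq, positions, k=2):
    """Tally (2k+1)-mers centred on candidate edited adenosines."""
    counts = Counter()
    for pos in positions:
        ctx = flank_context(seq, pos, k)
        # Keep only windows whose centre base really is an A.
        if ctx is not None and ctx[k] == "A":
            counts[ctx] += 1
    return counts

# Toy reference and site list (hypothetical, for illustration only).
toy_seq = "GGCCAGGTTCCAGGAATAGC"
toy_sites = [4, 11, 17, 0]
counts = motif_counts(toy_seq, toy_sites)
print(counts.most_common(1))
```

In this toy input two of the three usable sites sit in a CCAGG window; the site at position 0 is dropped because it has no upstream flank.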

A comparison of DNA/RNA extraction protocols for high-throughput sequencing of microbial communities

10.1101/2020.11.13.370387 ◽

2020 ◽

Author(s):

Justin P. Shaffer ◽

Clarisse Marotz ◽

Pedro Belda-Ferre ◽

Cameron Martino ◽

Stephen Wandro ◽

...

Keyword(s):

Microbial Community ◽

High Throughput ◽

High Throughput Sequencing ◽

Limit Of Detection ◽

Sequence Data ◽

Rna Virus ◽

Microbial Community Composition ◽

Rna Extraction ◽

Acid Extraction ◽

Dna And Rna

Abstract. One goal among microbial ecology researchers is to capture the maximum amount of information from all organisms in a sample. The recent COVID-19 pandemic, caused by the RNA virus SARS-CoV-2, has highlighted a gap in traditional DNA-based protocols, including the high-throughput methods we previously established as field standards. To enable simultaneous SARS-CoV-2 and microbial community profiling, we compare the relative performance of two total nucleic acid extraction protocols and our previously benchmarked protocol. We included a diverse panel of environmental and host-associated sample types, including body sites commonly swabbed for COVID-19 testing. Here we present results comparing the cost, processing time, DNA and RNA yield, microbial community composition, limit of detection, and well-to-well contamination between these protocols. Accession numbers: Raw sequence data were deposited at the European Nucleotide Archive (accession#: ERP124610) and raw and processed data are available at Qiita (Study ID: 12201). All processing and analysis code is available on GitHub (github.com/justinshaffer/Extraction_test_MagMAX). Methods summary: To allow for downstream applications involving RNA-based organisms such as SARS-CoV-2, we compared the two extraction protocols designed to extract DNA and RNA against our previously established protocol for extracting only DNA for microbial community analyses. Across 10 diverse sample types, one of the two protocols was equivalent to or better than our established DNA-based protocol. Our conclusion is based on per-sample comparisons of DNA and RNA yield, the number of quality sequences generated, microbial community alpha- and beta-diversity and taxonomic composition, the limit of detection, and extent of well-to-well contamination.

seqCAT: a Bioconductor R-package for variant analysis of high throughput sequencing data

F1000Research ◽

10.12688/f1000research.16083.1 ◽

2018 ◽

Vol 7 ◽

pp. 1466 ◽

Cited By ~ 2

Author(s):

Erik Fasterius ◽

Cristina Al-Khalili Szigyarto

Keyword(s):

Genetic Variation ◽

Liver Cancer ◽

High Throughput ◽

High Throughput Sequencing ◽

R Package ◽

Ease Of Use ◽

Sequencing Data ◽

Dna And Rna ◽

High Throughput Sequencing Data ◽

Wide Range

High throughput sequencing technologies are flourishing in the biological sciences, enabling unprecedented insights into e.g. genetic variation, but require extensive bioinformatic expertise for the analysis. There is thus a need for simple yet effective software that can analyse both existing and novel data, providing interpretable biological results with little bioinformatic prowess. We present seqCAT, a Bioconductor toolkit for analysing genetic variation in high throughput sequencing data. It is a highly accessible, easy-to-use and well-documented R-package that enables a wide range of researchers to analyse their own and publicly available data, providing biologically relevant conclusions and publication-ready figures. SeqCAT can provide information regarding genetic similarities between an arbitrary number of samples, validate specific variants as well as define functionally similar variant groups for further downstream analyses. Its ease of use, installation, complete data-to-conclusions functionality and the inherent flexibility of the R programming language make seqCAT a powerful tool for variant analyses compared to already existing solutions. A publicly available dataset of liver cancer-derived organoids is analysed herein using the seqCAT package, demonstrating that the organoids are genetically stable. A previously known liver cancer-related mutation is additionally shown to be present in a sample though it was not listed in the original publication. Differences between DNA- and RNA-based variant calls in this dataset are also analysed, revealing a high median concordance of 97.5%.
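
One simple way to think about concordance between DNA- and RNA-based variant calls is the share of positions called in both data types where the genotypes agree. The sketch below uses that definition as an assumption; it is written in Python for illustration, is not seqCAT's API (seqCAT is an R package), and may not match the exact metric the authors report. All names and the toy calls are hypothetical.

```python
def concordance(dna_calls, rna_calls):
    """Fraction of shared positions with identical genotype calls.

    Both arguments map (chromosome, position) to a genotype string.
    Returns None when the two call sets share no positions.
    """
    shared = dna_calls.keys() & rna_calls.keys()
    if not shared:
        return None
    agree = sum(dna_calls[key] == rna_calls[key] for key in shared)
    return agree / len(shared)

# Toy call sets (hypothetical, for illustration only).
dna = {("chr1", 100): "A/G", ("chr1", 250): "C/C", ("chr2", 40): "T/T"}
rna = {("chr1", 100): "A/G", ("chr1", 250): "C/T", ("chr3", 7): "G/G"}
score = concordance(dna, rna)
print(score)
```

Positions seen in only one data type are ignored here, which matters in practice because RNA-based calls cover only expressed regions.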

Genome-wide identification of RNA editing in seven porcine tissues by matched DNA and RNA high-throughput sequencing

Journal of Animal Science and Biotechnology ◽

10.1186/s40104-019-0326-9 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 5

Author(s):

Yuebo Zhang ◽

Longchao Zhang ◽

Jingwei Yue ◽

Xia Wei ◽

Ligang Wang ◽

...

Keyword(s):

Rna Editing ◽

High Throughput ◽

High Throughput Sequencing ◽

Dna And Rna ◽

Genome Wide ◽

Porcine Tissues

GASAL2: a GPU accelerated sequence alignment library for high-throughput NGS data

BMC Bioinformatics ◽

10.1186/s12859-019-3086-9 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 5

Author(s):

Nauman Ahmed ◽

Jonathan Lévy ◽

Shanshan Ren ◽

Hamid Mushtaq ◽

Koen Bertels ◽

...

Keyword(s):

Sequence Alignment ◽

High Throughput ◽

High Performance ◽

Local Alignment ◽

Global Alignment ◽

Pairwise Sequence Alignment ◽

Rna Sequences ◽

Dna And Rna ◽

Alignment Algorithms ◽

Ngs Data

Abstract. Background: Due to the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speed up this analysis. NVBIO is the only available GPU library that accelerates sequence alignment of high-throughput NGS data, but has limited performance. In this article we present GASAL2, a GPU library for aligning DNA and RNA sequences that outperforms existing CPU and GPU libraries. Results: The GASAL2 library provides specialized, accelerated kernels for local, global and all types of semi-global alignment. Pairwise sequence alignment can be performed with and without traceback. GASAL2 outperforms the fastest CPU-optimized SIMD implementations such as SeqAn and Parasail, as well as NVIDIA’s own GPU-based library known as NVBIO. GASAL2 is unique in performing sequence packing on GPU, which is up to 750x faster than NVBIO. Overall, on a GeForce GTX 1080 Ti GPU, GASAL2 is up to 21x faster than Parasail on a dual-socket hyper-threaded Intel Xeon system with 28 cores and up to 13x faster than NVBIO, with a query length of up to 300 bases and 100 bases, respectively. GASAL2 alignment functions are asynchronous/non-blocking and allow full overlap of CPU and GPU execution. The paper shows how to use GASAL2 to accelerate BWA-MEM, speeding up the local alignment by 20x, which gives an overall application speedup of 1.3x vs. CPU with up to 12 threads. Conclusions: The library provides high-performance APIs for local, global and semi-global alignment that can be easily integrated into various bioinformatics tools.
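
For readers unfamiliar with what a local-alignment kernel computes, the sketch below is a plain CPU reference for the Smith-Waterman score without traceback. It is not GASAL2 code and does not use its API; the linear gap penalty and the scoring values are simplifying assumptions chosen for illustration.

```python
def local_alignment_score(query, target, match=1, mismatch=-4, gap=-6):
    """Best Smith-Waterman local alignment score (no traceback).

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, so memory is O(len(target)).
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

score = local_alignment_score("GATTACA", "CCGATTACATT")
print(score)
```

Each cell depends only on its left, upper, and upper-left neighbours, which is why many independent query/target pairs can be scored in parallel on a GPU.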

seqCAT: a Bioconductor R-package for variant analysis of high throughput sequencing data

F1000Research ◽

10.12688/f1000research.16083.2 ◽

2019 ◽

Vol 7 ◽

pp. 1466 ◽

Cited By ~ 1

Author(s):

Erik Fasterius ◽

Cristina Al-Khalili Szigyarto

Keyword(s):

Genetic Variation ◽

Liver Cancer ◽

High Throughput ◽

High Throughput Sequencing ◽

R Package ◽

Ease Of Use ◽

Sequencing Data ◽

Dna And Rna ◽

High Throughput Sequencing Data ◽

Wide Range

High throughput sequencing technologies are flourishing in the biological sciences, enabling unprecedented insights into e.g. genetic variation, but require extensive bioinformatic expertise for the analysis. There is thus a need for simple yet effective software that can analyse both existing and novel data, providing interpretable biological results with little bioinformatic prowess. We present seqCAT, a Bioconductor toolkit for analysing genetic variation in high throughput sequencing data. It is a highly accessible, easy-to-use and well-documented R-package that enables a wide range of researchers to analyse their own and publicly available data, providing biologically relevant conclusions and publication-ready figures. SeqCAT can provide information regarding genetic similarities between an arbitrary number of samples, validate specific variants as well as define functionally similar variant groups for further downstream analyses. Its ease of use, installation, complete data-to-conclusions functionality and the inherent flexibility of the R programming language make seqCAT a powerful tool for variant analyses compared to already existing solutions. A publicly available dataset of liver cancer-derived organoids is analysed herein using the seqCAT package, corroborating the original authors' conclusions that the organoids are genetically stable. A previously known liver cancer-related mutation is additionally shown to be present in a sample though it was not listed in the original publication. Differences between DNA- and RNA-based variant calls in this dataset are also analysed, revealing a high median concordance of 97.5%. SeqCAT is open-source software under an MIT licence, available at https://bioconductor.org/packages/release/bioc/html/seqCAT.html.

High-resolution nucleic acid sequence mapping via in situ hybridization at the Electron Microscope level

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100140130 ◽

1995 ◽

Vol 53 ◽

pp. 752-753

Author(s):

B.A. Hamkalo ◽

S. Narayanswami ◽

A.P. Kausch

Keyword(s):

Nucleic Acid ◽

Interphase Nucleus ◽

Genome Mapping ◽

Precise Location ◽

Electron Microscope Level ◽

Rna Sequences ◽

Fundamental Interest ◽

Whole Cells ◽

Dna And Rna

The availability of nonradioactive methods to label nucleic acids, and the resulting faster and more sensitive detection, have made in situ hybridization the method of choice for locating specific DNA and RNA sequences on chromosomes and in whole cells in cytological preparations in many areas of biology. It is being applied to problems of fundamental interest to basic cell and molecular biologists such as the organization of the interphase nucleus in the context of putative functional domains; it is making major contributions to genome mapping efforts; and it is being applied to the analysis of clinical specimens. Although fluorescence detection of nucleic acid hybrids is routinely used, certain questions require greater resolution. For example, very closely linked sequences may not be separable using fluorescence; the precise location of sequences with respect to chromosome structures may be below the resolution of light microscopy (LM); and determining the relative positions of sequences on very small chromosomes may not be feasible.

Large-Scale, High-Throughput Validation of Short Hairpin RNA Sequences for RNA Interference

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057105284342 ◽

2006 ◽

Vol 11 (3) ◽

pp. 236-246 ◽

Cited By ~ 6

Author(s):

Laurence H. Lamarcq ◽

Bradley J. Scherer ◽

Michael L. Phelan ◽

Nikolai N. Kalnine ◽

Yen H. Nguyen ◽

...

Keyword(s):

High Throughput ◽

Large Scale ◽

Strong Support ◽

Gc Content ◽

Rapid Identification ◽

Hairpin Rna ◽

Rna Sequences ◽

Short Hairpin ◽

Short Hairpin Rnas ◽

Interfering Rna

A method for high-throughput cloning and analysis of short hairpin RNAs (shRNAs) is described. Using this approach, 464 shRNAs against 116 different genes were screened for knockdown efficacy, enabling rapid identification of effective shRNAs against 74 genes. Statistical analysis of the effects of various criteria on the activity of the shRNAs confirmed that some of the rules thought to govern small interfering RNA (siRNA) activity also apply to shRNAs. These include moderate GC content, absence of internal hairpins, and asymmetric thermal stability. However, the authors did not find strong support for position-specific rules. In addition, analysis of the data suggests that not all genes are equally susceptible to RNA interference (RNAi).
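
Of the design rules listed in this abstract, GC content is the easiest to compute directly. The sketch below shows one way to screen candidate sequences by it; the 30% to 60% window is a hypothetical threshold picked for illustration, since the abstract says only "moderate" and gives no numbers, and the function names are likewise made up.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)

def has_moderate_gc(seq, low=0.30, high=0.60):
    """True if GC content lies in [low, high].

    The default bounds are illustrative assumptions, not values
    taken from the study.
    """
    return low <= gc_content(seq) <= high

# Toy candidates (hypothetical sequences).
candidates = ["GCAUAUGC", "GGGGCCCC", "AUAUAUAU"]
kept = [s for s in candidates if has_moderate_gc(s)]
print(kept)
```

The other rules named in the abstract (internal hairpins, asymmetric thermal stability) need structure or thermodynamic models and are not covered by this sketch.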

A high-throughput screening method for evolving a demethylase enzyme with improved and new functionalities

Nucleic Acids Research ◽

10.1093/nar/gkaa1213 ◽

2020 ◽

Author(s):

Yuru Wang ◽

Christopher D Katanski ◽

Christopher Watkins ◽

Jessica N Pan ◽

Qing Dai ◽

...

Keyword(s):

High Throughput ◽

High Throughput Screening ◽

Screening Method ◽

Targeted Mutagenesis ◽

Rna Repair ◽

Dna And Rna ◽

Modified Dna ◽

Dna Substrates ◽

Trna Sequencing

Abstract. AlkB is a DNA/RNA repair enzyme that removes base alkylations such as N1-methyladenosine (m1A) or N3-methylcytosine (m3C) from DNA and RNA. The AlkB enzyme has been used as a critical tool to facilitate tRNA sequencing and identification of mRNA modifications. As a tool, AlkB mutants with better reactivity and new functionalities are highly desired; however, previous identification of such AlkB mutants was based on the classical approach of targeted mutagenesis. Here, we introduce a high-throughput screening method to evaluate libraries of AlkB variants for demethylation activity on RNA and DNA substrates. This method is based on a fluorogenic RNA aptamer with an internal modified RNA/DNA residue which can block reverse transcription or introduce mutations leading to loss of fluorescence inherent in the cDNA product. Demethylation by an AlkB variant eliminates the blockage or mutation and thereby restores the fluorescence signals. We applied our screening method to sites D135 and R210 in the Escherichia coli AlkB protein and identified a variant with improved activity beyond a previously known hyperactive mutant toward N1-methylguanosine (m1G) in RNA. We also applied our method to O6-methylguanosine (O6mG) modified DNA substrates and identified candidate AlkB variants with demethylating activity. Our study provides a high-throughput screening method for in vitro evolution of any demethylase enzyme.

Tracking DNA and RNA Sequences at High Resolution

Methods in Molecular Biology - Electron Microscopy ◽

10.1007/978-1-62703-776-1_16 ◽

2013 ◽

pp. 343-366 ◽

Cited By ~ 2

Author(s):

Dušan Cmarko ◽

Anna Ligasová ◽

Karel Koberna

Keyword(s):

High Resolution ◽

Rna Sequences ◽

Dna And Rna
