TransPrise: a novel machine learning approach for eukaryotic promoter prediction

PeerJ | DOI: 10.7717/peerj.7990 | 2019 | Vol 7 | pp. e7990 | Cited by ~2

Author(s): Stepan Pachganov, Khalimat Murtazalieva, Aleksei Zarubin, Dmitry Sokolov, Duane R. Chartier, ...

Keyword(s): Regression Model, Homo Sapiens, Mean Absolute Error, Source Code, Absolute Error, Processing Unit, Promoter Prediction, Transcription Start, Transcription Start Sites, A Genome

As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSS). In this paper we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. Our pipeline consists of two parts: a binary classifier runs first, and if a sequence is classified as TSS-containing, a regression step follows, which identifies the precise location of the TSS. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared the predictions of the TransPrise classification and regression models with the TSSPlant approach on the well-annotated genome of Oryza sativa. On a computer equipped with a graphics processing unit, the run time of TransPrise is 250 minutes for a 374 Mb genome. The Matthews correlation coefficient for TransPrise is 0.79, more than twice the 0.31 obtained for the TSSPlant classification models. This represents a high level of prediction accuracy. Additionally, the mean absolute error of the regression model is 29.19 nt, allowing for accurate prediction of TSS location. TransPrise was also tested on Homo sapiens, where the mean absolute error of the regression model was 47.986 nt. We provide the full basis for the comparison and encourage users to freely access our set of computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all necessary packages, models, and code, as well as the source code of the TransPrise algorithm, is available at (http://compubioverne.group/). The source code is ready to use and can be customized to predict TSS in any eukaryotic organism.
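
The two metrics quoted in the abstract above, and the classify-then-regress flow it describes, can be illustrated with a minimal Python sketch. This is not the TransPrise code: the function names, the `threshold` parameter, and the `classify` and `locate` callables are hypothetical stand-ins for whatever models a user supplies, and only the standard definitions of the Matthews correlation coefficient and mean absolute error are assumed.

```python
import math
from typing import Callable, Optional, Sequence


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute distance (e.g. in nucleotides) between predicted and true positions."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def two_stage_predict(
    window: str,
    classify: Callable[[str], float],  # hypothetical: probability that the window contains a TSS
    locate: Callable[[str], int],      # hypothetical: predicted TSS offset within the window
    threshold: float = 0.5,            # assumed cut-off, not taken from the paper
) -> Optional[int]:
    """Run the classifier first; call the regressor only for windows classified as TSS-containing."""
    if classify(window) < threshold:
        return None
    return locate(window)
```

Called with a toy classifier and regressor, `two_stage_predict` returns `None` for windows scored below the threshold and the regressor's offset otherwise, mirroring the two-part pipeline in the abstract.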

ReCappable Seq: Comprehensive Determination of Transcription Start Sites derived from all RNA polymerases

DOI: 10.1101/696559 | 2019 | Cited by ~1

Author(s): Bo Yan, George Tzertzinis, Ira Schildkraut, Laurence Ettwiller

Keyword(s): RNA Polymerases, Transcription Start, Eukaryotic Transcription, Pol II, Transcription Start Sites, Cell Functions, A Genome, Wide Scale, Nucleotide Resolution, Pol III

Methodologies for determining eukaryotic Transcription Start Sites (TSS) rely on the selection of the 5’ canonical cap structure of Pol-II transcripts and consequently ignore entire classes of TSS derived from other RNA polymerases, which play critical roles in various cell functions. To overcome this limitation, we developed ReCappable-seq and identified TSS from Pol-II and non-Pol-II transcripts at nucleotide resolution. Applied to the human transcriptome, ReCappable-seq identifies Pol-II TSS with higher specificity than CAGE and reveals a rich landscape of TSS associated notably with Pol-III transcripts, which previously could not be studied on a genome-wide scale. Novel TSS consistent with non-Pol-II transcripts can be found in the nuclear and mitochondrial genomes. By identifying TSS derived from all RNA polymerases, ReCappable-seq reveals distinct epigenetic marks among Pol-II and non-Pol-II TSS and provides a unique opportunity to concurrently interrogate the regulatory landscape of coding and non-coding RNA.

Fast iterative solvers for large compressed-sparse row linear systems on graphics processing unit

Pollack Periodica | DOI: 10.1556/pollack.10.2015.1.1 | 2015 | Vol 10 (1) | pp. 3-18 | Cited by ~1

Author(s): Frédéric Magoulès, Abal-Kassim Cheik Ahamed, Roman Putanowicz

Keyword(s): Linear Systems, Graphics Processing Unit, Iterative Solvers, Processing Unit, Compressed Sparse Row, Graphics Processing

Performance Analysis and Optimization of Graphics Processing Unit

SSRN Electronic Journal | DOI: 10.2139/ssrn.3350249 | 2019

Author(s): Lokendra Singh Umrao, Jay Prakash Pandey

Keyword(s): Performance Analysis, Graphics Processing Unit, Processing Unit, Graphics Processing

Implementing wide baseline matching algorithms on a graphics processing unit.

DOI: 10.2172/921737 | 2007

Author(s): Fredrick H. Rothganger, Kurt W. Larson, Antonio Ignacio Gonzales, Daniel S. Myers

Keyword(s): Graphics Processing Unit, Processing Unit, Wide Baseline Matching, Graphics Processing

Two Decades of 4D-QSAR: A Dying Art or Staging a Comeback?

International Journal of Molecular Sciences | DOI: 10.3390/ijms22105212 | 2021 | Vol 22 (10) | pp. 5212

Author(s): Andrzej Bak

Keyword(s): Molecular Conformation, Graphics Processing Unit, Processing Unit, Diverse Range, Current State, GPU Clusters, Pharmacophore Hypothesis, Rising Power, Graphics Processing, Ligand Conformation

A key question confronting computational chemists concerns the preferable ligand geometry that fits complementarily into the receptor pocket. Typically, the postulated ‘bioactive’ 3D ligand conformation is constructed as a ‘sophisticated guess’ (unnecessarily geometry-optimized) mirroring the pharmacophore hypothesis, which is sometimes based on an erroneous prerequisite. Hence, the 4D-QSAR scheme and its ‘dialects’ have been implemented in practice as a higher level of model abstraction that allows the examination of multiple molecular conformations, orientations and protonation representations. Nearly a quarter of a century has passed since the eminent work of Hopfinger appeared on the stage; the natural question therefore arises whether the 4D-QSAR approach still appeals to the scientific community. With no intention to be comprehensive, a review of the current state of the art in the field of receptor-independent (RI) and receptor-dependent (RD) 4D-QSAR methodology is provided, with a brief examination of the ‘mainstream’ algorithms. In fact, a myriad of 4D-QSAR methods have been implemented and applied in practice to a diverse range of molecules. It seems that the 4D-QSAR approach has been experiencing a promising renaissance of interest that might be fuelled by the rising power of graphics processing unit (GPU) clusters applied to full-atom MD-based simulations of protein-ligand complexes.

Parallelization of Global Sequence Alignment on Graphics Processing Unit

2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI) | DOI: 10.1109/ccci49893.2020.9256747 | 2020

Author(s): Kailash W. Kalare, Mohammad S. Obaidat, Jitendra V. Tembhurne, Chandrashekhar Meshram, Kuei-Fang Hsiao

Keyword(s): Sequence Alignment, Graphics Processing Unit, Processing Unit, Graphics Processing

Graphics processing unit acceleration of the island model genetic algorithm using the CUDA programming platform

Concurrency and Computation Practice and Experience | DOI: 10.1002/cpe.6286 | 2021

Author(s): Dylan M. Janssen, Wayne Pullan, Alan Wee‐Chung Liew

Keyword(s): Genetic Algorithm, Graphics Processing Unit, Island Model, Processing Unit, CUDA Programming, Graphics Processing

Real-time, High-resolution Depth Upsampling on Embedded Accelerators

ACM Transactions on Embedded Computing Systems | DOI: 10.1145/3436878 | 2021 | Vol 20 (3) | pp. 1-22

Author(s): David Langerman, Alan George

Keyword(s): High Resolution, Low Power, Real Time, Mixed Reality, Graphics Processing Unit, Processing Unit, Reconfigurable Logic, Depth Sensors, Time Requirements, Graphics Processing

High-resolution, low-latency apps in computer vision are ubiquitous in today’s world of mixed-reality devices. These innovations provide a platform that can leverage the improving technology of depth sensors and embedded accelerators to enable higher-resolution, lower-latency processing for 3D scenes using depth-upsampling algorithms. This research demonstrates that filter-based upsampling algorithms are feasible for mixed-reality apps using low-power hardware accelerators. The authors parallelized and evaluated a depth-upsampling algorithm on two different devices: a reconfigurable-logic FPGA embedded within a low-power SoC; and a fixed-logic embedded graphics processing unit. We demonstrate that both accelerators can meet the real-time requirements of 11 ms latency for mixed-reality apps.
