TIS transformer: Re-annotation of the human proteome using deep learning

2021 ◽  
Author(s):  
Jim Clauwaert ◽  
Zahra McVey ◽  
Ramneek Gupta ◽  
Gerben Menschaert

The precise detection of translation initiation sites is essential for proteome delineation. In turn, the accurate mapping of the proteome is fundamental in advancing our understanding of biological systems and cellular mechanisms. We propose TIS Transformer, a deep learning model for the determination of translation start sites, based on information embedded in processed transcript nucleotide sequences. Through the application of deep learning techniques first designed for natural language processing tasks, we have developed an approach that achieves state-of-the-art performance on the prediction of translation initiation sites. TIS Transformer utilizes the FAVOR+ algorithm for attention calculation, enabling the model to process full transcript sequences. Analysis of input importance revealed TIS Transformer's ability to detect key features of translation, such as translation stop sites and reading frames. Furthermore, we demonstrate TIS Transformer's ability to detect multiple peptides on a transcript, and peptides encoded by short Open Reading Frames (sORFs), either alongside a canonical coding sequence or in long non-coding RNAs. Using a cross-validation scheme, we apply TIS Transformer to re-annotate the full human transcriptome.
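To make the terms reading frame, translation stop site, and short ORF concrete, the following is a minimal sketch of a naive open reading frame scan over a transcript. It is not the TIS Transformer model and does not reflect how the model makes predictions; the function name `find_orfs`, the restriction to `ATG` start codons, and the toy sequence are assumptions made purely for illustration.

```python
# Illustrative only: a naive ORF scan, not the TIS Transformer model.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript, start_codon="ATG"):
    """Return (start, stop, frame) for each start codon with an in-frame stop.

    start is the 0-based index of the start codon, stop is the 0-based index
    of the first in-frame stop codon, and frame is start % 3.
    """
    seq = transcript.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != start_codon:
            continue
        # Walk codon by codon in the same reading frame until a stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos, start % 3))
                break
    return orfs


# Hypothetical toy transcript carrying two ORFs in different reading frames.
toy = "CCATGAAATAGGCATGGCTGCTTGACC"
for start, stop, frame in find_orfs(toy):
    print(start, stop, frame, toy[start:stop + 3])
```

A scan like this only enumerates candidates: it reports every start codon with an in-frame stop and cannot tell which ones are actually translated, which is the question a learned model is meant to answer.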

Author(s):  
Chong Chen ◽  
Ying Liu ◽  
Xianfang Sun ◽  
Shixuan Wang ◽  
Carla Di Cairano-Gilfedder ◽  
...  

Over the last few decades, reliability analysis has gained increasing attention because it can help lower maintenance costs. Time between failures (TBF) is an essential topic in reliability analysis. If the TBF can be accurately predicted, preventive maintenance can be scheduled in advance to avoid critical failures. The purpose of this paper is to study TBF prediction using deep learning techniques. Deep learning, being capable of capturing highly complex and nonlinear patterns, can be a useful tool for TBF prediction. The general principles of designing a deep learning model are introduced. Using a sizeable automobile TBF dataset, we conduct an experimental study on TBF prediction with deep learning and several data mining approaches. The empirical results show the merits of deep learning in performance, but at the cost of a high computational load.
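As a rough illustration of the quantity being predicted, the sketch below derives a TBF series from failure timestamps and slices it into fixed-length windows of the kind a sequence model could be trained on. The abstract does not describe the paper's data preparation, so the hour unit, the sliding-window framing, the function names, and the sample failure log are all assumptions.

```python
from datetime import datetime


def time_between_failures(timestamps):
    """Return the gaps, in hours, between consecutive failure timestamps."""
    ordered = sorted(timestamps)
    return [
        (later - earlier).total_seconds() / 3600.0
        for earlier, later in zip(ordered, ordered[1:])
    ]


def make_windows(series, window):
    """Turn a TBF series into (past values, next value) training pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


# Hypothetical failure log for one vehicle.
failures = [
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 3, 12, 0),
    datetime(2020, 1, 4, 0, 0),
    datetime(2020, 1, 8, 0, 0),
]
tbf = time_between_failures(failures)
print(tbf)
print(make_windows(tbf, window=2))
```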


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Sunil Kumar Prabhakar ◽  
Dong-Ok Won

To unlock the information present in clinical descriptions, automatic medical text classification is highly useful in natural language processing (NLP). Machine learning techniques are quite effective for medical text classification tasks; however, they require extensive human effort to create the labeled training data. For clinical and translational research, a huge quantity of detailed patient information, such as disease status, lab tests, medication history, side effects, and treatment outcomes, has been collected in an electronic format, and it serves as a valuable data source for further analysis. Much of this information is present in medical text, and processing it efficiently is a considerable challenge. In this work, a medical text classification paradigm using two novel deep learning architectures is proposed to reduce the human effort. The first approach is a quad channel hybrid long short-term memory (QC-LSTM) deep learning model that uses four channels, and the second is a hybrid bidirectional gated recurrent unit (BiGRU) deep learning model with multihead attention. The proposed methodology is validated on two medical text datasets, and a comprehensive analysis is conducted. The best classification accuracy, 96.72%, is obtained with the proposed QC-LSTM deep learning model, and a classification accuracy of 95.76% is obtained with the proposed hybrid BiGRU deep learning model.


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Juncai Li ◽  
Xiaofei Jiang

Molecular property prediction is an essential task in drug discovery. Most computational approaches based on deep learning focus either on designing novel molecular representations or on combining them with advanced models. However, researchers pay less attention to the potential benefits of massive unlabeled molecular data (e.g., ZINC). The task becomes increasingly challenging owing to the limited scale of labeled data. Motivated by the recent advancements of pretrained models in natural language processing, we note that a drug molecule can, to some extent, be naturally viewed as language. In this paper, we investigate how to develop the pretrained model BERT to extract useful molecular substructure information for molecular property prediction. We present a novel end-to-end deep learning framework, named Mol-BERT, that combines an effective molecular representation with a pretrained BERT model tailored for molecular property prediction. Specifically, a large-scale prediction BERT model is pretrained to generate the embedding of molecular substructures, using four million unlabeled drug SMILES (i.e., ZINC 15 and ChEMBL 27). Then, the pretrained BERT model can be fine-tuned on various molecular property prediction tasks. To examine the performance of our proposed Mol-BERT, we conduct several experiments on four widely used molecular datasets. In comparison to the traditional and state-of-the-art baselines, the results illustrate that our proposed Mol-BERT can outperform the current sequence-based methods and achieve at least a 2% improvement in ROC-AUC score on the Tox21, SIDER, and ClinTox datasets.


2021 ◽  
Vol 9 (2) ◽  
pp. 1051-1052
Author(s):  
K. Kavitha et al.

Sentiment refers to the opinions or views about a topic that people express through a medium of communication. Nowadays, social media is an effective platform for people to communicate, and it generates a huge amount of unstructured data every day. It is essential for any business organization in the current era to process and analyse sentiments using machine learning and Natural Language Processing (NLP) strategies. In recent times, deep learning strategies have become more popular owing to their higher performance. This paper presents an empirical study of the application of deep learning techniques to Sentiment Analysis (SA) for sarcastic messages and their increasing scope in real time. A taxonomy of sentiment analysis in recent times and its key terms are also highlighted in the manuscript. The survey summarizes the recent datasets considered, their key contributions, and the performance of the deep learning models applied to their primary purpose, such as sarcasm detection, in order to describe the efficiency of deep learning frameworks in the domain of sentiment analysis.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Kazi Nabiul Alam ◽  
Md Shakib Khan ◽  
Abdur Rab Dhruba ◽  
Mohammad Monirujjaman Khan ◽  
Jehad F. Al-Amri ◽  
...  

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. After the initiation of vaccinations against coronavirus, people’s feelings have become more diverse and complex. Our aim is to understand and unravel their sentiments in this research using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can have a better idea of what is trending and going on in people’s minds. Our motivation for this research was to understand the diverse sentiments of people regarding the vaccination process. In this research, the timeline of the collected tweets was from December 21 to July 21. The tweets contained information about the most common vaccines available recently from across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool, Valence Aware Dictionary for sEntiment Reasoner (VADER). Initializing the polarities of the obtained sentiments into three groups (positive, negative, and neutral) helped us visualize the overall scenario; our findings included 33.96% positive, 17.55% negative, and 48.49% neutral responses. In addition, we included our analysis of the timeline of the tweets in this research, as sentiments fluctuated over time. A recurrent neural network- (RNN-) oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM achieving 90.83%. Other performance metrics such as precision, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of the public’s opinion on COVID-19 vaccines and supports the aim of eradicating coronavirus from the world.
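The grouping of sentiment scores into positive, negative, and neutral classes can be sketched as follows. The abstract does not state which cutoffs were used; the ±0.05 thresholds on a compound score in [-1, 1] are the values commonly suggested for VADER and are an assumption here, as are the function names and the sample scores. The sketch takes precomputed compound scores as input and does not call the VADER library itself.

```python
from collections import Counter


def polarity_label(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to positive, negative, or neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def polarity_shares(compound_scores):
    """Return the percentage of scores falling in each polarity group."""
    counts = Counter(polarity_label(score) for score in compound_scores)
    total = len(compound_scores)
    return {
        label: round(100.0 * counts[label] / total, 2)
        for label in ("positive", "negative", "neutral")
    }


# Hypothetical compound scores, one per tweet.
scores = [0.62, -0.40, 0.0, 0.03, 0.51, -0.02, 0.0, -0.77]
print(polarity_shares(scores))
```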


2021 ◽  
Author(s):  
Saniya Karnik ◽  
Navya Yenuganti ◽  
Bonang Firmansyah Jusri ◽  
Supriya Gupta ◽  
Prasanna Nirgudkar ◽  
...  

Today, Electrical Submersible Pump (ESP) failure analysis is a tedious, human-intensive, and time-consuming activity involving dismantle, inspection, and failure analysis (DIFA) for each failure. This paper presents a novel artificial intelligence workflow using an ensemble of machine learning (ML) algorithms coupled with natural language processing (NLP) and deep learning (DL). The algorithms outlined in this paper bring together structured and unstructured data across equipment, production, operations, and failure reports to automate root cause identification and analysis post breakdown. This process will result in reduced turnaround time (TAT) and human effort, thus drastically improving process efficiency.


Author(s):  
Yilin Yan ◽  
Jonathan Chen ◽  
Mei-Ling Shyu

Stance detection is an important research direction which attempts to automatically determine the attitude (positive, negative, or neutral) of the author of a text (such as a tweet) towards a target. Nowadays, a number of frameworks have been proposed using deep learning techniques that show promising results in application domains such as automatic speech recognition and computer vision, as well as natural language processing (NLP). This article presents a novel deep learning-based fast stance detection framework in bipolar affinities on Twitter. Millions of tweets regarding Clinton and Trump were produced per day on Twitter during the 2016 United States presidential election campaign, and the campaign is therefore used as a test use case because of its significant and unique counter-factual properties. In addition, stance detection can be utilized to imply the political tendency of the general public. Experimental results show that the proposed framework achieves high accuracy compared with several existing stance detection methods.


2020 ◽  
Vol 10 (21) ◽  
pp. 7751
Author(s):  
Seong-Jae Hong ◽  
Won-Kyung Baek ◽  
Hyung-Sup Jung

Synthetic aperture radar (SAR) images have been used in many studies for ship detection because they can be captured without being affected by time and weather. In recent years, the development of deep learning techniques has facilitated studies on ship detection in SAR images. However, because the noise from SAR images can negatively affect the learning of the deep learning model, it is necessary to reduce the noise through preprocessing. In this study, deep learning vessel detection was performed using preprocessed SAR images, and the effects of the preprocessing of the images on deep learning vessel detection were compared and analyzed. Through the preprocessing of SAR images, (1) intensity images, (2) decibel images, and (3) intensity difference and texture images were generated. The M2Det object detection model was used for the deep learning process on the preprocessed SAR images. After the object detection model was trained, ship detection was performed using test images. The test results are presented in terms of precision, recall, and average precision (AP), which were 93.18%, 91.11%, and 89.78% for the intensity images; 94.16%, 94.16%, and 92.34% for the decibel images; and 97.40%, 94.94%, and 95.55% for the intensity difference and texture images, respectively. From the results, it can be found that the preprocessing of the SAR images can facilitate the deep learning process and improve the ship detection performance. The results of this study are expected to contribute to the development of deep learning-based ship detection techniques in SAR images in the future.
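Two of the quantities named above have simple closed forms, sketched here under stated assumptions. A decibel image is taken to be the standard 10 · log10 transform of linear intensity, and precision and recall are computed from detection counts in the usual way. The abstract does not give the study's exact preprocessing chain or matching criteria, so the clipping floor, the function names, and the sample counts are illustrative only.

```python
import math


def to_decibel(intensity, floor=1e-10):
    """Convert a linear SAR intensity value to decibels: 10 * log10(I).

    Values below `floor` are clipped so that zero pixels do not raise an error.
    """
    return 10.0 * math.log10(max(intensity, floor))


def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical pixel intensities and detection counts.
print([to_decibel(value) for value in (1.0, 0.1, 0.0)])
print(precision_recall(true_positives=90, false_positives=10, false_negatives=30))
```

Average precision additionally requires ranking detections by confidence and integrating precision over recall, which is omitted from this sketch.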


2020 ◽  
Vol 26 (1) ◽  
Author(s):  
Ayei E. Ibor ◽  
Florence A. Oladeji ◽  
Olusoji B. Okunoye ◽  
Charles O. Uwadia

The prediction of cyberattacks has been a major concern in cybersecurity. This is due to the huge financial and resource losses incurred by organisations after a cyberattack. The emergence of new applications and disruptive technologies has come with new vulnerabilities, most of which are novel, with no immediate remediation available. Recent attack signatures are becoming evasive, deploying very complex techniques and algorithms to infiltrate a network, leading to unauthorized access and modification of system parameters and classified data. Although there exist several approaches to mitigating attacks, the challenges of using known attack signatures and modeled behavioural profiles of network environments still linger. Consequently, this paper discusses the use of unsupervised statistical and supervised deep learning techniques to predict attacks by mapping hyper-alerts to class labels of attacks. This enhances the processes of feature extraction and transformation, as a means of giving a structured interpretation of the dynamic profiles of a network.

Keywords: Alert correlation, Cyberattack prediction, Cybersecurity, Deep learning, Cyberattacks, Supervised and Unsupervised Learning

Vol. 26 No 1, June 2019

