PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm

Background: Pseudouridine (Ψ) is a common ribonucleotide modification that plays a significant role in many biological processes. The identification of Ψ modification sites is of great significance for disease mechanism and biological processes research in which machine learning algorithms are desirable as the lab exploratory techniques are expensive and time-consuming.Results: In this work, we propose a deep learning framework, called PseUdeep, to identify Ψ sites of three species: H. sapiens, S. cerevisiae, and M. musculus. In this method, three encoding methods are used to extract the features of RNA sequences, that is, one-hot encoding, K-tuple nucleotide frequency pattern, and position-specific nucleotide composition. The three feature matrices are convoluted twice and fed into the capsule neural network and bidirectional gated recurrent unit network with a self-attention mechanism for classification.Conclusion: Compared with other state-of-the-art methods, our model gets the highest accuracy of the prediction on the independent testing data set S-200; the accuracy improves 12.38%, and on the independent testing data set H-200, the accuracy improves 0.68%. Moreover, the dimensions of the features we derive from the RNA sequences are only 109,109, and 119 in H. sapiens, M. musculus, and S. cerevisiae, which is much smaller than those used in the traditional algorithms. On evaluation via tenfold cross-validation and two independent testing data sets, PseUdeep outperforms the best traditional machine learning model available. PseUdeep source code and data sets are available at https://github.com/dan111262/PseUdeep.

Download Full-text

Machine Learning Algorithms for Analysis of DNA Data Sets

Machine Learning Algorithms for Problem Solving in Computational Applications ◽

10.4018/978-1-4666-1833-6.ch004 ◽

2012 ◽

pp. 47-58 ◽

Cited By ~ 2

Author(s):

John Yearwood ◽

Adil Bagirov ◽

Andrei V. Kelarev

Keyword(s):

Machine Learning ◽

Dna Sequences ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Local Alignment ◽

Data Sets ◽

Data Set ◽

Applications Of Machine Learning ◽

New Machine

The applications of machine learning algorithms to the analysis of data sets of DNA sequences are very important. The present chapter is devoted to the experimental investigation of applications of several machine learning algorithms for the analysis of a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single copy region and inverted repeat A of the chloroplast genome in Eucalyptus collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as a measure of similarity. The alignment scores do not satisfy properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors’ experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set. Our novel k-committees algorithm produced most accurate results for classification. Two new examples of synthetic data sets demonstrate that the authors’ k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.

Download Full-text

Human Activity Recognition using Fourier Transform Inspired Deep Learning Combination Model

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327908666180727123657 ◽

2019 ◽

Vol 9 (1) ◽

pp. 16-31

Author(s):

Kyungkoo Jun

Keyword(s):

Fourier Transform ◽

Deep Learning ◽

Short Term Memory ◽

Window Size ◽

Sensor Data ◽

Data Sets ◽

Data Set ◽

Proposed Model ◽

Testing Data ◽

Labeling Scheme

Background & Objective: This paper proposes a Fourier transform inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing 1D input signal into 2D patterns, which is motivated by the Fourier conversion. The decomposition is helped by Long Short-Term Memory (LSTM) which captures the temporal dependency from the signal and then produces encoded sequences. The sequences, once arranged into the 2D array, can represent the fingerprints of the signals. The benefit of such transformation is that we can exploit the recent advances of the deep learning models for the image classification such as Convolutional Neural Network (CNN). Results: The proposed model, as a result, is the combination of LSTM and CNN. We evaluate the model over two data sets. For the first data set, which is more standardized than the other, our model outperforms previous works or at least equal. In the case of the second data set, we devise the schemes to generate training and testing data by changing the parameters of the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy is over 95% for some cases. We also analyze the effect of the parameters on the performance.

Download Full-text

Birds Sound Classification Based on Machine Learning Algorithms

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v9i430227 ◽

2021 ◽

pp. 1-11

Author(s):

Aska E. Mehyadin ◽

Adnan Mohsin Abdulazeez ◽

Dathar Abas Hasan ◽

Jwan N. Saeed

Keyword(s):

Machine Learning ◽

Noise Suppression ◽

Bird Species ◽

Machine Learning Algorithms ◽

Data Sets ◽

Learning Technology ◽

Species Classification ◽

Data Set ◽

Sound Classification ◽

Mel Frequency Cepstral Coefficient

The bird classifier is a system that is equipped with an area machine learning technology and uses a machine learning method to store and classify bird calls. Bird species can be known by recording only the sound of the bird, which will make it easier for the system to manage. The system also provides species classification resources to allow automated species detection from observations that can teach a machine how to recognize whether or classify the species. Non-undesirable noises are filtered out of and sorted into data sets, where each sound is run via a noise suppression filter and a separate classification procedure so that the most useful data set can be easily processed. Mel-frequency cepstral coefficient (MFCC) is used and tested through different algorithms, namely Naïve Bayes, J4.8 and Multilayer perceptron (MLP), to classify bird species. J4.8 has the highest accuracy (78.40%) and is the best. Accuracy and elapsed time are (39.4 seconds).

Download Full-text

Deep Learning Approaches for Sentiment Analysis Challenges and Future Issues

10.4018/978-1-7998-8161-2.ch003 ◽

2022 ◽

pp. 27-50

Author(s):

Rajalaxmi Prabhu B. ◽

Seema S.

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Model Building ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets ◽

Learning Approaches ◽

Learning Techniques ◽

Important Challenge

A lot of user-generated data is available these days from huge platforms, blogs, websites, and other review sites. These data are usually unstructured. Analyzing sentiments from these data automatically is considered an important challenge. Several machine learning algorithms are implemented to check the opinions from large data sets. A lot of research has been undergone in understanding machine learning approaches to analyze sentiments. Machine learning mainly depends on the data required for model building, and hence, suitable feature exactions techniques also need to be carried. In this chapter, several deep learning approaches, its challenges, and future issues will be addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze the deep-learning techniques for predicting sentiments and understanding the importance of several approaches for mining opinions and determining sentiment polarity.

Download Full-text

Stock price prediction using DEEP learning algorithm and its comparison with machine learning algorithms

Intelligent Systems in Accounting Finance & Management ◽

10.1002/isaf.1459 ◽

2019 ◽

Vol 26 (4) ◽

pp. 164-174 ◽

Cited By ~ 3

Author(s):

Mahla Nikou ◽

Gholamreza Mansourfar ◽

Jamshid Bagherzadeh

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Stock Price ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Stock Price Prediction ◽

Price Prediction ◽

Deep Learning Algorithm

Download Full-text

A Surveillance on Machine Learning Algorithms and Its Applications

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9064 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4294-4298

Author(s):

B. R. Sunil Kumar ◽

B. S. Siddhartha ◽

S. N. Shwetha ◽

K. Arpitha

Keyword(s):

Machine Learning ◽

Health Care ◽

Sentiment Analysis ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

Data Set ◽

Pros And Cons ◽

Primary Advantage

This paper intends to use distinct machine learning algorithms and exploring its multi-features. The primary advantage of machine learning is, a machine learning algorithm can predict its work automatically by learning what to do with information. This paper reveals the concept of machine learning and its algorithms which can be used for different applications such as health care, sentiment analysis and many more. Sometimes the programmers will get confused which algorithm to apply for their applications. This paper provides an idea related to the algorithm used on the basis of how accurately it fits. Based on the collected data, one of the algorithms can be selected based upon its pros and cons. By considering the data set, the base model is developed, trained and tested. Then the trained model is ready for prediction and can be deployed on the basis of feasibility.

Download Full-text

Bees Assorter

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtst061212 ◽

2020 ◽

Vol 6 (12) ◽

pp. 61-65

Author(s):

Himanshu Verma

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Bumble Bee ◽

Wing Size ◽

Qualitative And Quantitative Analysis ◽

Data Set ◽

Qualitative And Quantitative ◽

The Difference

Many attempts were made to classify the bees that is bumble bee or honey bee , there have been such a large amount of researches which were made to seek out the difference between them on the premise of various features like wing size , size of bee , color, life cycle and many more. But altogether the analysis there have been either that specialize in qualitative or quantitative , but to beat this issue , thus researchers came up with an answer which might be both qualitative and quantitative analysis made to classify them. And making use of machine learning algorithm to classify them gives a lift . Now the classification would take less time as these algorithms are pretty fast and accurate . By using machine learning work is made easy . Lots of photographs had to be collected and stored for data set. And by using these machine learning algorithms we would be getting information about the bees which might be employed by researchers in further classification of bees. Manipulation of images had to be done so as on prepare them in such a way that they will be applied to the algorithms and have feature extraction done. As there have been a lot of photographs(data set) which take a lot of space and also the area in which bees were present in these photographs were too small so to accommodate it dimension reduction was done , it might not consider other images like trees , leaves , flowers which were there present in the photograph which we elect as a data set.

Download Full-text

A Self-Supervised Machine Learning Approach for Objective Live Cell Segmentation and Analysis

10.21203/rs.3.rs-147010/v1 ◽

2021 ◽

Author(s):

Marc Raphael ◽

Michael Robitaille ◽

Jeff Byers ◽

Joseph Christodoulides

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Label Cell ◽

Live Cell ◽

Machine Learning Algorithms ◽

Classification Model ◽

Supervised Machine Learning ◽

Cell Segmentation ◽

Data Set

Abstract Machine learning algorithms hold the promise of greatly improving live cell image analysis by way of (1) analyzing far more imagery than can be achieved by more traditional manual approaches and (2) by eliminating the subjective nature of researchers and diagnosticians selecting the cells or cell features to be included in the analyzed data set. Currently, however, even the most sophisticated model based or machine learning algorithms require user supervision, meaning the subjectivity problem is not removed but rather incorporated into the algorithm’s initial training steps and then repeatedly applied to the imagery. To address this roadblock, we have developed a self-supervised machine learning algorithm that recursively trains itself directly from the live cell imagery data, thus providing objective segmentation and quantification. The approach incorporates an optical flow algorithm component to self-label cell and background pixels for training, followed by the extraction of additional feature vectors for the automated generation of a cell/background classification model. Because it is self-trained, the software has no user-adjustable parameters and does not require curated training imagery. The algorithm was applied to automatically segment cells from their background for a variety of cell types and five commonly used imaging modalities - fluorescence, phase contrast, differential interference contrast (DIC), transmitted light and interference reflection microscopy (IRM). The approach is broadly applicable in that it enables completely automated cell segmentation for long-term live cell phenotyping applications, regardless of the input imagery’s optical modality, magnification or cell type.

Download Full-text

A Self-Supervised Machine Learning Approach for Objective Live Cell Segmentation and Analysis

10.1101/2021.01.07.425773 ◽

2021 ◽

Author(s):

Michael C. Robitaille ◽

Jeff M. Byers ◽

Joseph A. Christodoulides ◽

Marc P. Raphael

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Label Cell ◽

Live Cell ◽

Machine Learning Algorithms ◽

Classification Model ◽

Supervised Machine Learning ◽

Cell Segmentation ◽

Data Set

Machine learning algorithms hold the promise of greatly improving live cell image analysis by way of (1) analyzing far more imagery than can be achieved by more traditional manual approaches and (2) by eliminating the subjective nature of researchers and diagnosticians selecting the cells or cell features to be included in the analyzed data set. Currently, however, even the most sophisticated model based or machine learning algorithms require user supervision, meaning the subjectivity problem is not removed but rather incorporated into the algorithm's initial training steps and then repeatedly applied to the imagery. To address this roadblock, we have developed a self-supervised machine learning algorithm that recursively trains itself directly from the live cell imagery data, thus providing objective segmentation and quantification. The approach incorporates an optical flow algorithm component to self-label cell and background pixels for training, followed by the extraction of additional feature vectors for the automated generation of a cell/background classification model. Because it is self-trained, the software has no user-adjustable parameters and does not require curated training imagery. The algorithm was applied to automatically segment cells from their background for a variety of cell types and five commonly used imaging modalities - fluorescence, phase contrast, differential interference contrast (DIC), transmitted light and interference reflection microscopy (IRM). The approach is broadly applicable in that it enables completely automated cell segmentation for long-term live cell phenotyping applications, regardless of the input imagery's optical modality, magnification or cell type.

Download Full-text

Research on Network Security Application Based on Deep Learning

CONVERTER ◽

10.17762/converter.235 ◽

2021 ◽

pp. 598-605

Author(s):

Zhao Jianchao

Keyword(s):

Deep Learning ◽

Learning Algorithm ◽

Rapid Development ◽

Internet Security ◽

Detection Accuracy ◽

Data Sets ◽

Learning Technology ◽

Data Set ◽

Deep Learning Algorithm ◽

Internet Industry

Behind the rapid development of the Internet industry, Internet security has become a hidden danger. In recent years, the outstanding performance of deep learning in classification and behavior prediction based on massive data makes people begin to study how to use deep learning technology. Therefore, this paper attempts to apply deep learning to intrusion detection to learn and classify network attacks. Aiming at the nsl-kdd data set, this paper first uses the traditional classification methods and several different deep learning algorithms for learning classification. This paper deeply analyzes the correlation among data sets, algorithm characteristics and experimental classification results, and finds out the deep learning algorithm which is relatively good at. Then, a normalized coding algorithm is proposed. The experimental results show that the algorithm can improve the detection accuracy and reduce the false alarm rate.

Download Full-text