Comparison of machine learning and deep learning techniques in promoter prediction across diverse species

2021 ◽  
Vol 7 ◽  
pp. e365
Author(s):  
Nikita Bhandari ◽  
Satyajeet Khare ◽  
Rahee Walambe ◽  
Ketan Kotecha

Gene promoters are key DNA regulatory elements positioned around transcription start sites and are responsible for regulating the gene transcription process. Various alignment-based, signal-based and content-based approaches have been reported for the prediction of promoters. However, since not all promoter sequences show explicit features, the prediction performance of these techniques is poor. Therefore, many machine learning and deep learning models have been proposed for promoter prediction. In this work, we studied methods for vector encoding and promoter classification using genome sequences of three distinct higher eukaryotes, viz. yeast (Saccharomyces cerevisiae), plant (Arabidopsis thaliana) and human (Homo sapiens). We compared the one-hot vector encoding method with frequency-based tokenization (FBT) for data pre-processing on a 1-D Convolutional Neural Network (CNN) model. We found that FBT gives a shorter input dimension, reducing the training time without affecting the sensitivity and specificity of classification. We employed deep learning techniques, mainly CNN and a recurrent neural network with Long Short-Term Memory (LSTM), and a random forest (RF) classifier for promoter classification at k-mer sizes of 2, 4 and 8. We found CNN to be superior in distinguishing promoters from non-promoter sequences (binary classification) as well as in species-specific classification of promoter sequences (multiclass classification). In summary, the contribution of this work lies in the use of a synthetic shuffled negative dataset and frequency-based tokenization for pre-processing. This study provides a comprehensive and generic framework for classification tasks in genomic applications and can be extended to various classification problems.
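The two encodings compared in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of overlapping k-mers, the rank-by-frequency vocabulary, and the base-shuffling scheme for negatives are all assumptions made for illustration.

```python
import random
from collections import Counter

BASES = "ACGT"

def one_hot(seq):
    """Encode each base as a 4-element indicator vector (unknown bases become all zeros)."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]

def kmers(seq, k):
    """Split a sequence into overlapping k-mers (assumed; the abstract does not say)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_vocab(seqs, k):
    """Rank k-mers by corpus frequency; the most frequent gets index 1, 0 is kept for unseen k-mers."""
    counts = Counter(km for s in seqs for km in kmers(s, k))
    ranked = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: i + 1 for i, km in enumerate(ranked)}

def tokenize(seq, vocab, k):
    """Map a sequence to a list of integer token ids, one per k-mer."""
    return [vocab.get(km, 0) for km in kmers(seq, k)]

def shuffled_negative(seq, seed=0):
    """Hypothetical negative example: same base composition, shuffled order."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)

seqs = ["TATAAAGC", "TATATAGC"]
vocab = build_vocab(seqs, k=2)
tokens = tokenize(seqs[0], vocab, k=2)
```

For a sequence of length L, one-hot encoding yields L x 4 values, whereas tokenization yields L - k + 1 integers, which is one way to read the "shorter input dimension" reported above.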

2020 ◽  
Vol 4 (2) ◽  
pp. 371-379
Author(s):  
David O. Oyewola ◽  
Bernard Alechenu ◽  
Kuluwa A. Al-Mustapha ◽  
Oluwatoyosi V. Oyewande

Dementia is the most frequent degenerative illness in adults, and early diagnosis can forestall or slow its progression. In this study, we used deep learning techniques for the classification of dementia. Data were collected from the OASIS database for all patients receiving dementia screening. The data included the patient’s sex, age, education, socioeconomic status, Mini-Mental State Examination, Clinical Dementia Rating, Atlas Scaling Factor, Estimated Total Intracranial Volume and Normalized Whole Brain Volume. The performance of the Generalized Regression Neural Network (GRNN), Radial Basis Neural Network (RBNN), Multilayer Perceptron Neural Network (MPNN) and Long Short-Term Memory (LSTM) is compared using sensitivity, specificity and detection rate. The results show that, with 100% efficiency, GRNN, RBNN and LSTM tend to be the best in the classification of dementia. The use of deep learning such as LSTM for early diagnosis of dementia can help improve the process of dementia diagnosis.
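The three evaluation measures named above can be computed from confusion-matrix counts, as in the sketch below. The abstract does not define them, so the definitions here are assumptions: sensitivity as TP / (TP + FN), specificity as TN / (TN + FP), and detection rate as TP divided by the total number of samples. The function name and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity and detection rate for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        # Share of true positives that were found.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Share of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        # Assumed definition: true positives over all samples.
        "detection_rate": tp / len(pairs) if pairs else float("nan"),
    }

# Toy labels, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```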


Vibration ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 341-356
Author(s):  
Jessada Sresakoolchai ◽  
Sakdirat Kaewunruen

Various techniques have been developed to detect railway defects. One of the popular techniques is machine learning. This unprecedented study applies deep learning, which is a branch of machine learning techniques, to detect and evaluate the severity of rail combined defects. The combined defects in the study are settlement and dipped joint. Features used to detect and evaluate the severity of combined defects are axle box accelerations simulated using a verified rolling stock dynamic behavior simulation called D-Track. A total of 1650 simulations are run to generate numerical data. Deep learning techniques used in the study are deep neural network (DNN), convolutional neural network (CNN), and recurrent neural network (RNN). Simulated data are used in two ways: simplified data and raw data. Simplified data are used to develop the DNN model, while raw data are used to develop the CNN and RNN model. For simplified data, features are extracted from raw data, which are the weight of rolling stock, the speed of rolling stock, and three peak and bottom accelerations from two wheels of rolling stock. In total, there are 14 features used as simplified data for developing the DNN model. For raw data, time-domain accelerations are used directly to develop the CNN and RNN models without processing and data extraction. Hyperparameter tuning is performed to ensure that the performance of each model is optimized. Grid search is used for performing hyperparameter tuning. To detect the combined defects, the study proposes two approaches. The first approach uses one model to detect settlement and dipped joint, and the second approach uses two models to detect settlement and dipped joint separately. The results show that the CNN models of both approaches provide the same accuracy of 99%, so one model is good enough to detect settlement and dipped joint. To evaluate the severity of the combined defects, the study applies classification and regression concepts. 
Classification is used to evaluate the severity by categorizing defects into light, medium, and severe classes, and regression is used to estimate the size of defects. From the study, the CNN model is suitable for evaluating dipped joint severity with an accuracy of 84% and mean absolute error (MAE) of 1.25 mm, and the RNN model is suitable for evaluating settlement severity with an accuracy of 99% and mean absolute error (MAE) of 1.58 mm.
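The 14-feature "simplified data" vector and the MAE figure quoted above can be sketched as follows. This is an illustration under assumptions, not the study's code: "peak" and "bottom" accelerations are taken to be the three largest and three smallest samples of each wheel's signal, and all function names are hypothetical.

```python
import heapq

def wheel_features(accel, n=3):
    """Return the n largest and n smallest acceleration samples for one wheel."""
    return heapq.nlargest(n, accel) + heapq.nsmallest(n, accel)

def simplified_features(weight, speed, accel_wheel_1, accel_wheel_2):
    """Build the feature vector: 2 vehicle features + 6 per wheel for two wheels = 14."""
    return [weight, speed] + wheel_features(accel_wheel_1) + wheel_features(accel_wheel_2)

def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and estimated defect sizes."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy signals, not simulation output from D-Track.
wheel_1 = [0.1, -0.4, 2.3, 0.0, -1.9, 1.1, 0.6, -0.2]
wheel_2 = [0.2, -0.3, 1.8, 0.1, -2.2, 0.9, 0.4, -0.1]
features = simplified_features(40.0, 120.0, wheel_1, wheel_2)
```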


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. The infant cry signals have distinct patterns depending on the purpose of the cries. For audio signals, preprocessing, feature extraction, and feature selection need expert attention and take much effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work mainly discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry auditory signals are transformed into spectrogram images using the short-time Fourier transform (STFT) technique. The deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, a machine learning technique that classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. When comparing the SVM kernel techniques, namely radial basis function (RBF), linear and polynomial, it is found that SVM-RBF provides the highest accuracy for the kernel-based infant cry classification system, 88.89%.
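The STFT step that turns an audio signal into a spectrogram can be sketched in plain Python as below. This is a minimal illustration, not the authors' pipeline: the frame length, hop size and Hann window are assumed values, and the naive DFT stands in for the FFT routine a signal-processing library would normally provide.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=8, hop=4):
    """Return a list of frames, each a list of magnitudes for bins 0..frame_len/2."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Toy input: a sine wave that falls exactly on frequency bin 2.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spectrogram = stft_magnitude(tone, frame_len=8, hop=4)
```

Each row of `spectrogram` is one time frame; stacking the rows gives the time-frequency image that a CNN would take as input.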


2021 ◽  
Author(s):  
Ghazaala Yasmin ◽  
Asit Kumar Das ◽  
Janmenjoy Nayak ◽  
S Vimal ◽  
Soumi Dutta

Speech is one of the most delicate media through which the gender of a speaker can easily be identified. Although related research has shown very good progress with machine learning, deep learning has recently opened a promising research area for addressing the deficiencies of gender discrimination using traditional machine learning techniques. In deep learning techniques, the speech features are automatically generated by reinforcement learning from the raw data, and they have more discriminating power than human-generated features. But in some practical situations like gender recognition, it is observed that a combination of both types of features sometimes provides comparatively better performance. In the proposed work, we first extracted and selected some informative and precise acoustic features relevant to gender recognition using entropy-based information theory and Rough Set Theory (RST). Next, the audio speech signals are fed directly into a deep neural network model consisting of a Convolution Neural Network (CNN) and a Gated Recurrent Unit network (GRUN) to extract features useful for gender recognition. The RST selects precise and informative features, the CNN extracts the locally encoded important features, and the GRUN reduces the vanishing gradient and exploding gradient problems. Finally, a hybrid gender recognition system is developed by combining both generated feature vectors. The developed model has been tested with five benchmark datasets and a simulated dataset to evaluate its performance, and it is observed that the combined feature vector provides a more effective gender recognition system, especially when transgender is considered as a gender type together with male and female.


Cataract is a degenerative condition that, according to estimates, will rise globally. Even though there are various proposals for its diagnosis, problems remain to be solved. This paper aims to identify the current state of recent investigations on cataract diagnosis, using a framework to conduct the literature review with the intention of answering the following research questions: RQ1) Which are the existing methods for cataract diagnosis? RQ2) Which are the features considered for the diagnosis of cataracts? RQ3) Which is the existing classification when diagnosing cataracts? RQ4) Which obstacles arise when diagnosing cataracts? Additionally, a cross-analysis of the results was made. The results showed that new research is required in (1) the classification of “congenital cataract” and (2) portable solutions, which are necessary to make cataract diagnoses easily and at a low cost.


2020 ◽  
Vol 3 (1) ◽  
pp. 445-454
Author(s):  
Celal Buğra Kaya ◽  
Alperen Yılmaz ◽  
Gizem Nur Uzun ◽  
Zeynep Hilal Kilimci

Pattern classification concerns the automatic discovery of regularities in a dataset through the use of various learning techniques, so that objects can be assigned to a set of categories or classes. This study evaluates deep learning methodologies for the classification of stock patterns. To classify patterns obtained from stock charts, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory networks (LSTMs) are employed. To demonstrate the efficiency of the proposed model in categorizing patterns, a hand-crafted image dataset is constructed from stock charts of the Istanbul Stock Exchange and the NASDAQ Stock Exchange. Experimental results show that convolutional neural networks exhibit superior classification success in recognizing patterns compared to the other deep learning methodologies.


Algorithms ◽  
2018 ◽  
Vol 11 (11) ◽  
pp. 170 ◽  
Author(s):  
Zhixi Li ◽  
Vincent Tam

Momentum and reversal effects are important phenomena in stock markets. In academia, relevant studies have been conducted for years. Researchers have attempted to analyze these phenomena using statistical methods and to give some plausible explanations. However, those explanations are sometimes unconvincing. Furthermore, it is very difficult to transfer the findings of these studies to real-world investment trading strategies due to the lack of predictive ability. This paper represents the first attempt to adopt machine learning techniques for investigating the momentum and reversal effects occurring in any stock market. In the study, various machine learning techniques, including the Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron Neural Network (MLP), and Long Short-Term Memory Neural Network (LSTM) were explored and compared carefully. Several models built on these machine learning approaches were used to predict the momentum or reversal effect on the stock market of mainland China, thus allowing investors to build corresponding trading strategies. The experimental results demonstrated that these machine learning approaches, especially the SVM, are beneficial for capturing the relevant momentum and reversal effects, and possibly building profitable trading strategies. Moreover, we propose the corresponding trading strategies in terms of market states to acquire the best investment returns.


2020 ◽  
Vol 17 (4) ◽  
pp. 1925-1930
Author(s):  
Ambeshwar Kumar ◽  
R. Manikandan ◽  
Robbi Rahim

Deep learning is a new-era technology in the field of medical engineering that gives awareness about various healthcare features. Deep learning is a part of machine learning; it is capable of handling high-dimensional data and is efficient in concentrating on the right features. A tumor is an unbelievably complex disease: a multifaceted cell has more than a hundred billion cells; each cell acquires mutations exclusively. Detection of tumor particles in experiments is easily done by MRI or CT. Brain tumors can also be detected by MRI; however, deep learning techniques give a better approach to segmenting brain tumor images. Deep learning models are loosely inspired by information processing and communication patterns in biological nervous systems. Classification plays a significant role in brain tumor detection. A neural network creates a well-organized rule for classification. To handle medical image data, the neural network is trained using the convolution algorithm. A multilayer perceptron is intended for the identification of an image. In this study, the brain images are categorized into two types: normal and abnormal. This article emphasizes the importance of classification and feature selection approaches for predicting brain tumors. This classification is done by machine learning techniques such as Artificial Neural Networks, Support Vector Machine and Deep Neural Network. It should be noted that more than one technique can be applied for the segmentation of a tumor. Several samples of brain tumor images are classified using the deep learning algorithms convolution neural network and multilayer perceptron.


Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Jurgita Kapočiūtė-Dzikienė ◽  
Robertas Damaševičius ◽  
Marcin Woźniak

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial, NBM; Support Vector Machine, SVM) and deep learning (Long Short-Term Memory, LSTM; Convolutional Neural Network, CNN) approaches. The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (reaching an accuracy of 0.706) were achieved on the full dataset with CNN applied on top of the FastText embeddings, replaced emoticons, and eliminated diacritics. The traditional machine learning approaches demonstrated the best performance (an accuracy of 0.735) on the full dataset with the NBM method, replaced emoticons, restored diacritics, and lemma unigrams as features. Although traditional machine learning approaches were superior when compared to the deep learning methods, deep learning demonstrated good results when applied on the small datasets.
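Two of the pre-processing steps mentioned above, replacing emoticons and eliminating diacritics, can be sketched with the Python standard library. This is only an illustration: the emoticon table and the placeholder tokens `emo_positive` and `emo_negative` are hypothetical, and the abstract does not describe how the authors implemented either step.

```python
import unicodedata

# Hypothetical emoticon-to-token table; a real one would be much larger.
EMOTICONS = {
    ":)": "emo_positive",
    ":-)": "emo_positive",
    ":(": "emo_negative",
    ":-(": "emo_negative",
}

def strip_diacritics(text):
    """Decompose characters (NFD) and drop the combining marks, e.g. 'č' becomes 'c'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def replace_emoticons(text):
    """Replace whitespace-separated emoticons with sentiment placeholder tokens."""
    return " ".join(EMOTICONS.get(token, token) for token in text.split())

comment = "ačiū, labai gerai :)"
normalized = replace_emoticons(strip_diacritics(comment))
```

Restoring diacritics, the opposite step used in the best NBM configuration, needs a language model or dictionary and is not covered by this sketch.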

