Deep learning model for metagenome fragment classification using spaced k-mers feature extraction

2020 ◽  
Vol 8 (3) ◽  
pp. 234-238
Author(s):  
Nur Choiriyati ◽  
Yandra Arkeman ◽  
Wisnu Ananta Kusuma

An open challenge in bioinformatics is the analysis of sequenced metagenomes from various environments. Several studies have demonstrated bacterial classification at the genus level using k-mers for feature extraction, where higher values of k give better accuracy but are costly in terms of computational resources and time. The spaced k-mers method was used to extract sequence features using 111 1111 10001, where 1 marks a position that must match and 0 marks a position that may or may not match. Currently, deep learning provides the best solutions to many problems in image recognition, speech recognition, and natural language processing. In this research, two deep learning architectures, a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN), were trained for taxonomic classification of metagenome data, with the spaced k-mers method used for feature extraction. The results showed that the DNN classifier reached 90.89% accuracy and the CNN classifier reached 88.89% accuracy at the genus level.
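As a rough illustration of the spaced k-mer idea in this abstract, the following Python sketch counts spaced k-mers for a single mask. The function names, the `ALPHABET` constant, the example sequence, and the short mask `1101` are assumptions made for illustration and are not the authors' implementation; the masks quoted in the abstract could presumably be passed in the same way.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def spaced_kmer_counts(sequence, mask):
    """Count spaced k-mers: positions with '1' in the mask are kept,
    positions with '0' are ignored (they may or may not match)."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    counts = Counter()
    for start in range(len(sequence) - len(mask) + 1):
        window = sequence[start:start + len(mask)]
        key = "".join(window[i] for i in keep)
        # Skip windows whose kept positions contain ambiguous bases such as 'N'.
        if all(base in ALPHABET for base in key):
            counts[key] += 1
    return counts

def feature_vector(sequence, mask):
    """Fixed-length vector with one entry per possible spaced k-mer."""
    weight = mask.count("1")  # number of kept positions
    counts = spaced_kmer_counts(sequence, mask)
    return [counts["".join(kmer)] for kmer in product(ALPHABET, repeat=weight)]

# Hypothetical fragment and mask, for illustration only.
features = feature_vector("ACGTACGTTA", "1101")
print(len(features), sum(features))
```

The vector length depends only on the number of 1s in the mask (4 raised to that number), not on the mask's total span, which is one way spaced k-mers can cover longer windows without the feature space growing as fast as it does for contiguous k-mers.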

Author(s):  
G. Rama Janani

This paper addresses the classification of respiratory illnesses such as COVID-19 and pneumonia using deep learning. The symptoms of COVID-19 and pneumonia are similar, so it is often difficult to identify the cause of a patient's condition without testing for COVID-19 or other respiratory infections. To determine how COVID-19 and pneumonia differ from one another, this paper presents a novel Convolutional Neural Network (CNN), built with TensorFlow and Keras, for COVID-19 and pneumonia classification. The proposed system implements a CNN that uses pneumonia images to classify cases as COVID-19, normal, or pneumonia. The knowledge from these studies can potentially help in diagnosing these diseases. The results are expected to improve if the CNN is supplemented with additional feature extraction methods, thereby improving the efficacy and potential of applying deep CNNs to images.


2020 ◽  
Vol 31 (13) ◽  
pp. 1346-1354 ◽  
Author(s):  
Yukiko Nagao ◽  
Mika Sakamoto ◽  
Takumi Chinen ◽  
Yasushi Okada ◽  
Daisuke Takao

By applying convolutional neural network-based classifiers, we demonstrate that cell images can be robustly classified according to cell cycle phases. Combined with Grad-CAM analysis, our approach enables us to extract biological features underlying cellular phenomena of interest in an unbiased and data-driven manner.


Computation ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 3
Author(s):  
Sima Sarv Ahrabi ◽  
Michele Scarpiniti ◽  
Enzo Baccarelli ◽  
Alireza Momenzadeh

In parallel with the vast medical research on clinical treatment of COVID-19, an important step toward bringing the disease completely under control is to carefully monitor patients. Detection of COVID-19 relies mostly on viral tests; however, the study of X-rays is helpful because they are readily available. Various studies employ Deep Learning (DL) paradigms to reinforce radiography-based recognition of lung infection by COVID-19. In this regard, we compare the noteworthy approaches devoted to the binary classification of infected images using DL techniques, and we propose a variant of a convolutional neural network (CNN) with optimized parameters that performs very well on a recent COVID-19 dataset. The proposed model’s effectiveness is particularly significant given its uncomplicated design, in contrast to the other models presented. In our approach, we randomly set aside several images of the dataset as a hold-out set; the model detects most of the COVID-19 X-rays correctly, with an overall accuracy of 99.8%. In addition, tests on different datasets of diverse characteristics (which, more specifically, are not used in the training process) demonstrate the effectiveness of the proposed approach, with an accuracy of up to 93%.
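The hold-out evaluation mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the 20% test fraction, and the fixed seed are hypothetical choices, since the abstract does not say how many images were set aside.

```python
import random

def hold_out_split(items, test_fraction=0.2, seed=0):
    """Randomly set aside a fraction of the items as a hold-out set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]  # (train, hold-out)

def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical image identifiers standing in for X-ray files.
train_ids, test_ids = hold_out_split(range(100), test_fraction=0.2)
print(len(train_ids), len(test_ids))
```

Because the hold-out items are never seen during training, accuracy measured on them estimates how the model behaves on new images drawn from the same dataset; the separate cross-dataset figure in the abstract tests a harder case.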


2021 ◽  
Vol 12 ◽  
Author(s):  
Ning Cheng ◽  
Yue Chen ◽  
Wanqing Gao ◽  
Jiajun Liu ◽  
Qunfu Huang ◽  
...  

Purpose: This study proposes an S-TextBLCNN model for classifying the efficacy of traditional Chinese medicine (TCM) formulae. The model uses deep learning to analyze the relationship between herb efficacy and formula efficacy, which is helpful in further exploring the internal rules of formula combination. Methods: First, for the TCM herbs extracted from the Chinese Pharmacopoeia, natural language processing (NLP) is used to learn a quantitative representation of the different herbs. Three features (herb name, herb properties, and herb efficacy) are selected to encode herbs and to construct the formula-vector and herb-vector. Then, based on 2,664 formulae for stroke collected from the TCM literature and 19 formula efficacy categories extracted from Yifang Jijie, an improved deep learning model, TextBLCNN, consisting of a bidirectional long short-term memory (Bi-LSTM) neural network and a convolutional neural network (CNN), is proposed. Based on the 19 formula efficacy categories, binary classifiers are established to classify the TCM formulae. Finally, to address the imbalance in the formula data, the over-sampling method SMOTE is applied, yielding the S-TextBLCNN model. Results: The formula-vector composed of herb efficacy has the best effect on the classification model, so it can be inferred that there is a strong relationship between herb efficacy and formula efficacy. The TextBLCNN model has an accuracy of 0.858 and an F1-score of 0.762, both higher than the logistic regression (acc = 0.561, F1-score = 0.567), SVM (acc = 0.703, F1-score = 0.591), LSTM (acc = 0.723, F1-score = 0.621), and TextCNN (acc = 0.745, F1-score = 0.644) models. In addition, using SMOTE to tackle data imbalance improves the F1-score by an average of 47.1% across the 19 models. Conclusion: The combination of formula feature representation and the S-TextBLCNN model improves the accuracy of formula efficacy classification. It provides a new research idea for the study of TCM formula compatibility.
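To make the SMOTE step in this abstract more concrete, here is a simplified pure-Python sketch of the core idea: each synthetic sample is interpolated between a minority-class sample and one of its nearest minority-class neighbours. The function name, parameters, and toy data are assumptions for illustration; the authors' actual pipeline and settings are not described in the abstract.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by interpolating between a
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy two-dimensional minority class, for illustration only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_synthetic=10, k=2)
print(len(new_samples))
```

Synthetic samples always lie on line segments between existing minority samples, so over-sampling in this way adds variety without duplicating points exactly. It should normally be applied to the training portion only, so that evaluation data stay untouched.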


Materials ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 7027
Author(s):  
Stephania Kossman ◽  
Maxence Bigerelle

High-speed nanoindentation rapidly generates large datasets, opening the door for advanced data analysis methods such as the resources available in artificial intelligence. The present study addresses the problem of differentiating load–displacement curves presenting pop-in, slope changes, or instabilities from curves exhibiting a typical loading path in large nanoindentation datasets. Classification of the curves was achieved with a deep learning model, specifically, a convolutional neural network (CNN) model implemented in Python using TensorFlow and Keras libraries. Load–displacement curves (with pop-in and without pop-in) from various materials were input to train and validate the model. The curves were converted into square matrices (50 × 50) and then used as inputs for the CNN model. The model successfully differentiated between pop-in and non-pop-in curves with approximately 93% accuracy in the training and validation datasets, indicating that the risk of overfitting the model was negligible. These results confirmed that artificial intelligence and computer vision models represent a powerful tool for analyzing nanoindentation data.
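The abstract does not say how the curves were converted into 50 × 50 matrices. One plausible approach, shown here purely as an assumption, is to rasterise each load-displacement curve onto a binary grid, much like drawing it as a small image. The function name and the synthetic curve are hypothetical.

```python
def curve_to_matrix(displacement, load, size=50):
    """Rasterise a load-displacement curve onto a size x size binary grid."""
    if len(displacement) != len(load) or not load:
        raise ValueError("inputs must be non-empty and of equal length")
    x_min, x_max = min(displacement), max(displacement)
    y_min, y_max = min(load), max(load)
    x_span = (x_max - x_min) or 1.0  # avoid division by zero for flat data
    y_span = (y_max - y_min) or 1.0
    grid = [[0] * size for _ in range(size)]
    for x, y in zip(displacement, load):
        col = min(size - 1, int((x - x_min) / x_span * size))
        row = min(size - 1, int((y - y_min) / y_span * size))
        grid[size - 1 - row][col] = 1  # row 0 is the top of the image
    return grid

# Synthetic monotone loading curve, for illustration only.
depth = [float(i) for i in range(200)]
force = [d ** 1.5 for d in depth]
matrix = curve_to_matrix(depth, force)
print(len(matrix), len(matrix[0]))
```

A representation like this discards absolute scale, because each curve is normalised to its own range, which may or may not match what the authors did.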


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0256500
Author(s):  
Maleika Heenaye-Mamode Khan ◽  
Nazmeen Boodoo-Jahangeer ◽  
Wasiimah Dullull ◽  
Shaista Nathire ◽  
Xiaohong Gao ◽  
...  

The real cause of breast cancer is very challenging to determine, and early detection of the disease is therefore necessary to reduce the death rate from breast cancer. Early detection of cancer can increase the chance of survival by up to 8%. Primarily, breast images from mammograms, X-rays, or MRI are analyzed by radiologists to detect abnormalities. However, even experienced radiologists face problems in identifying features like micro-calcifications, lumps, and masses, leading to high false-positive and false-negative rates. Recent advances in image processing and deep learning create hope for more enhanced applications that can be used for the early detection of breast cancer. In this work, we have developed a Deep Convolutional Neural Network (CNN) to segment and classify the various types of breast abnormalities, such as calcifications, masses, asymmetry, and carcinomas, leading to improved disease management; existing research work, by contrast, has mainly classified the cancer as benign or malignant. First, transfer learning was carried out on our dataset using the pre-trained model ResNet50. Along similar lines, we have developed an enhanced deep learning model in which the learning rate is considered one of the most important attributes when training the neural network. The learning rate is set adaptively in our proposed model based on changes in the error curves during the learning process. The proposed deep learning model has achieved a performance of 88% in the classification of these four types of breast cancer abnormalities: masses, calcifications, carcinomas, and asymmetry.
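The abstract states only that the learning rate is adapted from changes in the error curves, without giving the rule. As one hedged illustration, the sketch below uses a simple "bold driver" style heuristic: grow the rate slightly when the error falls, and cut it when the error rises. The function name, the growth and shrink factors, and the bounds are all assumptions, not the authors' method.

```python
def adapt_learning_rate(lr, prev_error, curr_error, grow=1.05, shrink=0.5,
                        min_lr=1e-6, max_lr=1.0):
    """Return a new learning rate based on the change in error."""
    if curr_error < prev_error:
        lr *= grow    # error improved: take slightly larger steps
    else:
        lr *= shrink  # error stalled or worsened: take smaller steps
    return max(min_lr, min(max_lr, lr))

# Hypothetical per-epoch validation errors, for illustration only.
errors = [0.90, 0.70, 0.75, 0.60]
lr = 0.01
for prev, curr in zip(errors, errors[1:]):
    lr = adapt_learning_rate(lr, prev, curr)
print(lr)
```

In a Keras training loop, a rule like this would typically be wrapped in a callback; the built-in `ReduceLROnPlateau` callback implements a related, decrease-only policy.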


Author(s):  
Ying He ◽  
Zhen Shen ◽  
Qinhu Zhang ◽  
Siguo Wang ◽  
De-Shuang Huang

DNA/RNA motif mining is the foundation of gene function research. It plays an extremely important role in identifying DNA- or RNA-protein binding sites, which helps in understanding the mechanisms of gene regulation and management. For the past few decades, researchers have been working on designing new, efficient, and accurate algorithms for mining motifs. These algorithms can be roughly divided into two categories: the enumeration approach and the probabilistic method. In recent years, machine learning methods have made great progress, and algorithms based on deep learning in particular have achieved good performance. Existing deep learning methods in motif mining can be roughly divided into three types of models: convolutional neural network (CNN) based models, recurrent neural network (RNN) based models, and hybrid CNN–RNN based models. We introduce the application of deep learning in the field of motif mining in terms of data preprocessing and the features of existing deep learning architectures, and we compare the differences between the basic deep learning models. Through the analysis and comparison of existing deep learning methods, we found that more complex models tend to perform better than simple ones when data are sufficient, and that the current methods are relatively simple compared with those in other fields such as computer vision, natural language processing (NLP), and computer games. Therefore, it is necessary to summarize deep learning for motif mining, which can help researchers understand this field.
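The abstract mentions data preprocessing without naming a specific encoding. A common choice for CNN-based sequence models is one-hot encoding, sketched below as an assumption rather than as the method surveyed. The function name and the handling of unknown symbols are illustrative choices.

```python
BASES = "ACGT"

def one_hot_encode(sequence):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T).
    Unknown symbols such as 'N' become all-zero rows."""
    encoded = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        index = BASES.find(base)
        if index >= 0:
            row[index] = 1
        encoded.append(row)
    return encoded

print(one_hot_encode("ACGTN"))
```

With this representation, a convolutional filter of width w spanning all four channels can be read as a position weight matrix scanning for a motif of length w, which is one reason the encoding pairs naturally with CNN-based models.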


2020 ◽  
Vol 13 (4) ◽  
pp. 627-640 ◽  
Author(s):  
Avinash Chandra Pandey ◽  
Dharmveer Singh Rajpoot

Background: Sentiment analysis is the contextual mining of text to determine users' viewpoints on topics commonly discussed on social networking websites. Twitter is one of the social sites where people express their opinions about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. Manual feature extraction is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representation capability than the traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce the computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy for most of the datasets. Further, the efficacy of the proposed method has been validated through statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.

