scholarly journals A Robust Feature Extraction and Deep Learning Approach for Cancer Gene Prognosis

Author(s):  
P Kamala Kumari ◽  
Joseph Beatrice Seventline

Mutated genes are one of the prominent factors in origination and spread of cancer disease. Here we have used Genomic signal processing methods to identify the patterns that differentiate cancer and non-cancerous genes. Furthermore, Deep learning algorithms were used to model a system that automatically predicts the cancer gene. Unlike the existing methods, two feature extraction modules are deployed to extract six attributes. Power Spectral Density based module was used to extract statistical parameters like Mean, Median, Standard deviation, Mean Deviation and Median Deviation. Adaptive Functional Link Network (AFLN) based filter module was used to extract Normalized Mean Square Error (NMSE). The uniqueness of this paper is identification of six input features that differentiates cancer genes. In this work artificial neural network is developed to predict cancer genes. Comparison is done on three sets of datasets with 6 attributes, 5 attributes and one attribute. We performed all the training and testing on the Tensorflow using the Keras library in Python using Google Colab. The developed approach proved its efficiency with 6 attributes attaining an accuracy of 98% for 150 epochs. The ANN model was also compared with existing work and attained a 10 fold cross validation accuracy of 96.26% with an increase of 1.2%.

2020 ◽  
Vol 39 (4) ◽  
pp. 5699-5711
Author(s):  
Shirong Long ◽  
Xuekong Zhao

The smart teaching mode overcomes the shortcomings of traditional teaching online and offline, but there are certain deficiencies in the real-time feature extraction of teachers and students. In view of this, this study uses the particle swarm image recognition and deep learning technology to process the intelligent classroom video teaching image and extracts the classroom task features in real time and sends them to the teacher. In order to overcome the shortcomings of the premature convergence of the standard particle swarm optimization algorithm, an improved strategy for multiple particle swarm optimization algorithms is proposed. In order to improve the premature problem in the search performance algorithm of PSO algorithm, this paper combines the algorithm with the useful attributes of other algorithms to improve the particle diversity in the algorithm, enhance the global search ability of the particle, and achieve effective feature extraction. The research indicates that the method proposed in this paper has certain practical effects and can provide theoretical reference for subsequent related research.


2020 ◽  
Vol 16 (8) ◽  
pp. 1088-1105
Author(s):  
Nafiseh Vahedi ◽  
Majid Mohammadhosseini ◽  
Mehdi Nekoei

Background: The poly(ADP-ribose) polymerases (PARP) is a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, some efficient linear and non-linear methods including multiple linear regression (MLR), support vector machine (SVM) and artificial neural networks (ANN) were successfully used to develop and establish quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used to a rational division of the whole data set and selection of the training and test sets. A genetic algorithm (GA) variable selection method was employed to select the optimal subset of descriptors that have the most significant contributions to the overall inhibitory activity from the large pool of calculated descriptors. Results: The accuracy and predictability of the proposed models were further confirmed using crossvalidation, validation through an external test set and Y-randomization (chance correlations) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that non-linear modeling approaches, including SVM and ANN could provide much more prediction capabilities. Conclusion: Among the constructed models and in terms of root mean square error of predictions (RMSEP), cross-validation coefficients (Q2 LOO and Q2 LGO), as well as R2 and F-statistical value for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were more proper using the GA-ANN model.


2020 ◽  
Author(s):  
Anusha Ampavathi ◽  
Vijaya Saradhi T

UNSTRUCTURED Big data and its approaches are generally helpful for healthcare and biomedical sectors for predicting the disease. For trivial symptoms, the difficulty is to meet the doctors at any time in the hospital. Thus, big data provides essential data regarding the diseases on the basis of the patient’s symptoms. For several medical organizations, disease prediction is important for making the best feasible health care decisions. Conversely, the conventional medical care model offers input as structured that requires more accurate and consistent prediction. This paper is planned to develop the multi-disease prediction using the improvised deep learning concept. Here, the different datasets pertain to “Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson’s disease, and Alzheimer’s disease”, from the benchmark UCI repository is gathered for conducting the experiment. The proposed model involves three phases (a) Data normalization (b) Weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized in order to make the attribute's range at a certain level. Further, weighted feature extraction is performed, in which a weight function is multiplied with each attribute value for making large scale deviation. Here, the weight function is optimized using the combination of two meta-heuristic algorithms termed as Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are subjected to the hybrid deep learning algorithms like “Deep Belief Network (DBN) and Recurrent Neural Network (RNN)”. As a modification to hybrid deep learning architecture, the weight of both DBN and RNN is optimized using the same hybrid optimization algorithm. Further, the comparative evaluation of the proposed prediction over the existing models certifies its effectiveness through various performance measures.


2021 ◽  
Vol 13 (8) ◽  
pp. 1602
Author(s):  
Qiaoqiao Sun ◽  
Xuefeng Liu ◽  
Salah Bourennane

Deep learning models have strong abilities in learning features and they have been successfully applied in hyperspectral images (HSIs). However, the training of most deep learning models requires labeled samples and the collection of labeled samples are labor-consuming in HSI. In addition, single-level features from a single layer are usually considered, which may result in the loss of some important information. Using multiple networks to obtain multi-level features is a solution, but at the cost of longer training time and computational complexity. To solve these problems, a novel unsupervised multi-level feature extraction framework that is based on a three dimensional convolutional autoencoder (3D-CAE) is proposed in this paper. The designed 3D-CAE is stacked by fully 3D convolutional layers and 3D deconvolutional layers, which allows for the spectral-spatial information of targets to be mined simultaneously. Besides, the 3D-CAE can be trained in an unsupervised way without involving labeled samples. Moreover, the multi-level features are directly obtained from the encoded layers with different scales and resolutions, which is more efficient than using multiple networks to get them. The effectiveness of the proposed multi-level features is verified on two hyperspectral data sets. The results demonstrate that the proposed method has great promise in unsupervised feature learning and can help us to further improve the hyperspectral classification when compared with single-level features.


2021 ◽  
pp. 1063293X2198894
Author(s):  
Prabira Kumar Sethy ◽  
Santi Kumari Behera ◽  
Nithiyakanthan Kannan ◽  
Sridevi Narayanan ◽  
Chanki Pandey

Paddy is an essential nutrient worldwide. Rice gives 21% of worldwide human per capita energy and 15% of per capita protein. Asia represented 60% of the worldwide populace, about 92% of the world’s rice creation, and 90% of worldwide rice utilization. With the increase in population, the demand for rice is increased. So, the productivity of farming is needed to be enhanced by introducing new technology. Deep learning and IoT are hot topics for research in various fields. This paper suggested a setup comprising deep learning and IoT for monitoring of paddy field remotely. The vgg16 pre-trained network is considered for the identification of paddy leaf diseases and nitrogen status estimation. Here, two strategies are carried out to identify images: transfer learning and deep feature extraction. The deep feature extraction approach is combined with a support vector machine (SVM) to classify images. The transfer learning approach of vgg16 for identifying four types of leaf diseases and prediction of nitrogen status results in 79.86% and 84.88% accuracy. Again, the deep features of Vgg16 and SVM results for identifying four types of leaf diseases and prediction of nitrogen status have achieved an accuracy of 97.31% and 99.02%, respectively. Besides, a framework is suggested for monitoring of paddy field remotely based on IoT and deep learning. The suggested prototype’s superiority is that it controls temperature and humidity like the state-of-the-art and can monitor the additional two aspects, such as detecting nitrogen status and diseases.


2021 ◽  
Vol 7 (5) ◽  
pp. 89
Author(s):  
George K. Sidiropoulos ◽  
Polixeni Kiratsa ◽  
Petros Chatzipetrou ◽  
George A. Papakostas

This paper aims to provide a brief review of the feature extraction methods applied for finger vein recognition. The presented study is designed in a systematic way in order to bring light to the scientific interest for biometric systems based on finger vein biometric features. The analysis spans over a period of 13 years (from 2008 to 2020). The examined feature extraction algorithms are clustered into five categories and are presented in a qualitative manner by focusing mainly on the techniques applied to represent the features of the finger veins that uniquely prove a human’s identity. In addition, the case of non-handcrafted features learned in a deep learning framework is also examined. The conducted literature analysis revealed the increased interest in finger vein biometric systems as well as the high diversity of different feature extraction methods proposed over the past several years. However, last year this interest shifted to the application of Convolutional Neural Networks following the general trend of applying deep learning models in a range of disciplines. Finally, yet importantly, this work highlights the limitations of the existing feature extraction methods and describes the research actions needed to face the identified challenges.


Symmetry ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1156
Author(s):  
Mohamed Yusuf Hassan

The most effective techniques for predicting time series patterns include machine learning and classical time series methods. The aim of this study is to search for the best artificial intelligence and classical forecasting techniques that can predict the spread of acute respiratory infection (ARI) and pneumonia among under-five-year old children in Somaliland. The techniques used in the study include seasonal autoregressive integrated moving averages (SARIMA), mixture transitions distribution (MTD), and long short term memory (LSTM) deep learning. The data used in the study were monthly observations collected from five regions in Somaliland from 2011–2014. Prediction results from the three best competing models are compared by using root mean square error (RMSE) and absolute mean deviation (MAD) accuracy measures. Results have shown that the deep learning LSTM and MTD models slightly outperformed the classical SARIMA model in predicting ARI values.


2009 ◽  
Vol 7 (44) ◽  
pp. 423-437 ◽  
Author(s):  
Tijana Milenković ◽  
Vesna Memišević ◽  
Anand K. Ganesan ◽  
Nataša Pržulj

Many real-world phenomena have been described in terms of large networks. Networks have been invaluable models for the understanding of biological systems. Since proteins carry out most biological processes, we focus on analysing protein–protein interaction (PPI) networks. Proteins interact to perform a function. Thus, PPI networks reflect the interconnected nature of biological processes and analysing their structural properties could provide insights into biological function and disease. We have already demonstrated, by using a sensitive graph theoretic method for comparing topologies of node neighbourhoods called ‘graphlet degree signatures’, that proteins with similar surroundings in PPI networks tend to perform the same functions. Here, we explore whether the involvement of genes in cancer suggests the similarity of their topological ‘signatures’ as well. By applying a series of clustering methods to proteins' topological signature similarities, we demonstrate that the obtained clusters are significantly enriched with cancer genes. We apply this methodology to identify novel cancer gene candidates, validating 80 per cent of our predictions in the literature. We also validate predictions biologically by identifying cancer-related negative regulators of melanogenesis identified in our siRNA screen. This is encouraging, since we have done this solely from PPI network topology. We provide clear evidence that PPI network structure around cancer genes is different from the structure around non-cancer genes. Understanding the underlying principles of this phenomenon is an open question, with a potential for increasing our understanding of complex diseases.


Sign in / Sign up

Export Citation Format

Share Document