Application of deep learning methods in biological networks

Author(s):  
Shuting Jin ◽  
Xiangxiang Zeng ◽  
Feng Xia ◽  
Wei Huang ◽  
Xiangrong Liu

Abstract The increase in biological data and the formation of various biomolecule interaction databases enable us to obtain diverse biological networks. These biological networks provide a wealth of raw material for further understanding of biological systems, the discovery of complex diseases and the search for therapeutic drugs. However, the increase in data also increases the difficulty of biological network analysis. Therefore, algorithms that can handle large, heterogeneous and complex data are needed to better analyze the data of these network structures and mine their useful information. Deep learning is a branch of machine learning that extracts more abstract features from a larger set of training data. By building an artificial neural network with a hierarchical structure, deep learning can extract and screen the input information layer by layer and has representation learning ability. Improved deep learning algorithms can process complex and heterogeneous graph data structures and are increasingly being applied to the mining of network data. In this paper, we first introduce the deep learning models used for network data. Afterwards, we summarize the applications of deep learning to biological networks. Finally, we discuss the future development prospects of this field.
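The layer-by-layer processing of graph data described in this abstract can be illustrated with a single graph-convolution step. This is a minimal sketch in plain Python, assuming the widely used symmetric-normalization form ReLU(D^-1/2 (A + I) D^-1/2 X W); the abstract does not name a specific model, and the function names `matmul` and `gcn_layer` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One graph-convolution step (assumed form, not taken from the paper)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, then apply the learned linear map and ReLU.
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: two connected nodes with one feature each (illustrative values).
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
hidden = gcn_layer(adj, features, weights)
```

Stacking several such layers lets each node's representation draw on progressively larger neighbourhoods of the network.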

2014 ◽  
Vol 11 (2) ◽  
pp. 68-79
Author(s):  
Matthias Klapperstück ◽  
Falk Schreiber

Summary The visualization of biological data has gained increasing importance in recent years. A large number of methods and software tools are available that visualize biological data, including the combination of measured experimental data and biological networks. As networks grow in size, their handling and exploration becomes a challenging task for the user. In addition, scientists are interested not just in investigating a single kind of network, but in combining different types of networks, such as metabolic, gene regulatory and protein interaction networks. Therefore, fast access, abstract and dynamic views, and intuitive exploratory methods should be provided to search and extract information from the networks. This paper introduces a conceptual framework for handling and combining multiple network sources that enables abstract viewing and exploration of large data sets, including additional experimental data. It introduces a three-tier structure that links network data to multiple network views, discusses a proof-of-concept implementation, and shows a specific visualization method for combining metabolic and gene regulatory networks in an example.


2021 ◽  
Vol 11 (12) ◽  
pp. 3044-3053
Author(s):  
Rakesh Kumar Mahendran ◽  
V. Prabhu ◽  
V. Parthasarathy ◽  
A. Mary Judith

Myocardial infarction (MI) may precipitate severe health damage and, if not treated in a timely manner, lead to irreversible death of the heart muscle as a result of prolonged lack of oxygen. The lack of accurate and early detection techniques for this heart disease has reduced the efficiency of MI diagnosis. In this paper, the design and implementation of an efficient deep learning algorithm called the Adaptive Recurrent Neural Network (ARNN) is proposed for MI detection. The main objective of the proposed work is the accurate identification of MI using ECG signals. ECG signal denoising is performed using a multi-notch filter, which removes the specified noise frequency range. The discrete wavelet transform (DWT) is used for feature extraction; it decomposes the ECG signal into varied scales with a wavelet filter bank. After the extraction of specific QRS features, classification of defective and normal ECG arrhythmic beats is performed using the deep learning-based ARNN classifier. The MIT-BIH database is used for training and testing data. The performance of the proposed algorithm is evaluated based on classification accuracy. The results include a classification accuracy of about 99.21%, a sensitivity of 99% and a specificity of 99.4%, with PPV and NPV of about 99.4 and 99.01. These values indicate the enhanced performance of the proposed work compared with the conventional LSTM-CAE and LSTM-CNN techniques.
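The evaluation measures reported here (accuracy, sensitivity, specificity, PPV and NPV) all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not the paper's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Illustrative counts only: 100 MI beats and 100 normal beats.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

Sensitivity and specificity describe the classifier itself, whereas PPV and NPV also depend on how common the positive class is in the test set.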


2020 ◽  
pp. 1-17
Author(s):  
Yanhong Yang ◽  
Fleming Y.M. Lure ◽  
Hengyuan Miao ◽  
Ziqi Zhang ◽  
Stefan Jaeger ◽  
...  

Background: Accurate and rapid diagnosis of coronavirus disease (COVID-19) is crucial for timely quarantine and treatment. Purpose: In this study, a deep learning algorithm-based AI model using a ResUNet network was developed to evaluate the performance of radiologists with and without AI assistance in distinguishing COVID-19 infected pneumonia patients from other pulmonary infections on CT scans. Methods: For model development and validation, a total of 694 cases with 111,066 CT slides were retrospectively collected as training data and independent test data. Among them, 118 are confirmed COVID-19 infected pneumonia cases and 576 are other pulmonary infection cases (e.g. tuberculosis cases, common pneumonia cases and non-COVID-19 viral pneumonia cases). The cases were divided into training and testing datasets. The independent test was performed by evaluating and comparing the performance of three radiologists with different years of practice experience in distinguishing COVID-19 infected pneumonia cases with and without AI assistance. Results: Our final model achieved an overall test accuracy of 0.914 with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.903, with sensitivity and specificity of 0.918 and 0.909, respectively. The deep learning-based model also improved the radiologists’ performance in distinguishing COVID-19 from other pulmonary infections: average accuracy rose from 0.941 to 0.951 and average sensitivity from 0.895 to 0.942 compared with reading without AI assistance. Conclusion: The deep learning algorithm-based AI model developed in this study successfully improved radiologists’ performance in distinguishing COVID-19 from other pulmonary infections using chest CT images.
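The AUC reported in the results can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal sketch of that pairwise computation follows; the function name `roc_auc` and the example scores are hypothetical, and this is not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = COVID-19, 0 = other infection) and model scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; library implementations typically sort the scores instead.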


2021 ◽  
Vol 8 (3) ◽  
pp. 619
Author(s):  
Candra Dewi ◽  
Andri Santoso ◽  
Indriati Indriati ◽  
Nadia Artha Dewi ◽  
Yoke Kusuma Arbawa

The growing number of people with diabetes is one of the factors behind the rising number of people with diabetic retinopathy. One of the images used by ophthalmologists to identify diabetic retinopathy is a retinal photo. In this research, diabetic retinopathy is identified automatically using retinal fundus images and the Convolutional Neural Network (CNN) algorithm, which is a variant of deep learning. An obstacle in the recognition process is that the retina tends to be yellowish red, so the RGB color space does not produce optimal accuracy. Therefore, in this research, various color spaces were tested to obtain better results. Trials using 1000 images in the RGB, HSI, YUV and L*a*b* color spaces gave suboptimal results on balanced data, where the best accuracy was still below 50%. However, the unbalanced data gave a fairly high accuracy of 83.53% with training data in the YUV color space and 74.40% with testing data in all color spaces.
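To make the color-space step concrete, here is a minimal sketch of converting one RGB pixel to YUV. The abstract does not say which YUV definition was used; the coefficients below are the common BT.601 analog form and are an assumption, as is the function name `rgb_to_yuv`.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to YUV.

    Assumes BT.601 luma weights; the paper's exact conversion is not stated.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma (brightness)
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v


# A reddish pixel of the kind that dominates fundus images (illustrative values).
y, u, v = rgb_to_yuv(1.0, 0.0, 0.0)
```

Separating brightness from chroma in this way is one plausible reason a network might behave differently on YUV input than on raw RGB, although the abstract itself reports only the accuracies.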


2020 ◽  
Vol 10 (10) ◽  
pp. 2459-2465
Author(s):  
Iftikhar Ahmad ◽  
Muhammad Javed Iqbal ◽  
Mohammad Basheri

The size of data gathered from various ongoing biological and clinical studies is increasing at an exponential rate. This biological data mainly comprises genes of DNA, proteins and a variety of proteomics and genetic disease data. Additionally, DNA microarray data is available for early diagnosis and prediction of various types of cancer. Interestingly, this data may store vital information about genes, their structure and important biological functions. The huge volume and constant growth of the extracted biological data have opened several challenges. Many bioinformatics and machine learning models have been developed, but they fail to address key challenges in the efficient and accurate analysis of a variety of complex biological data, such as data on genetic diseases. The reliable and robust classification of the extracted data into different classes, based on the information hidden in the sample data, is also a very interesting and open problem. This research work mainly focuses on overcoming major challenges in accurate protein classification. It builds on the success of deep learning models in natural language processing by treating protein sequences as a language. The learning ability and overall classification performance of the proposed system can be validated with deep learning classification models. The proposed system can classify the mentioned datasets more accurately than previous approaches and shows better results. In-depth analysis of multifaceted biological data may also help in the early diagnosis of diseases caused by gene mutations and in overcoming emerging challenges in the development of large-scale healthcare systems.
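One common way to treat a protein sequence as a language is to split it into overlapping k-mers that play the role of words. The abstract does not describe its tokenization, so the sketch below is only an assumption for illustration; the function name `kmer_tokens`, the choice k = 3 and the example sequence are hypothetical.

```python
def kmer_tokens(sequence, k=3):
    """Split an amino-acid sequence into overlapping k-mers ("words").

    Returns an empty list when the sequence is shorter than k.
    """
    seq = sequence.strip().upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocabulary(sequences, k=3):
    """Map each distinct k-mer to an integer id, in order of first appearance."""
    vocab = {}
    for seq in sequences:
        for token in kmer_tokens(seq, k):
            vocab.setdefault(token, len(vocab))
    return vocab


# Illustrative sequence fragment, not taken from any dataset in the paper.
tokens = kmer_tokens("MKVLA")
vocab = build_vocabulary(["MKVLA", "KVLAG"])
```

The resulting integer ids can be fed to an embedding layer in the same way as word ids in a text classifier.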


Author(s):  
Subasish Das ◽  
Anandi Dutta ◽  
Karen Dixon ◽  
Lisa Minjares-Kyle ◽  
George Gillette

Motorcyclists are vulnerable highway users. Unlike passenger vehicle occupants, motorcycle riders have neither a protective structural surrounding nor the advanced restraints that are mandatory safety features in cars and light trucks. Per vehicle mile traveled, motorcyclist fatalities occurred 27 times more frequently than passenger car occupant fatalities in traffic crashes. In addition, there were 4,976 motorcycle crash-related fatalities in the U.S. in 2014, more than twice the number of motorcycle rider fatalities that occurred in 1997. This shows that, in addition to current efforts, research needs to be conducted with additional resources and in newer directions. This paper investigated five years (2010–2014) of Louisiana at-fault motorcycle rider-involved crashes by using deep learning, which is a competent tool for mapping a high-dimensional input into a lower-dimensional output. The current study contributes to the existing injury severity modeling literature by developing a deep learning framework, named DeepScooter, to predict motorcycle-involved crash severities. The final deep learning model can predict severity types with 100% accuracy on training data and 94% accuracy on test data, which is not attainable by using a statistical method or machine learning algorithm. Higher severity was found to be more likely associated with rider ejection, two-way roadways with no physical separation, curved roadways, and weekends. It is anticipated that the DeepScooter framework and the findings will provide significant contributions to the area of motorcycle safety.


Author(s):  
Mohamed Nadjib Boufenara ◽  
Mahmoud Boufaida ◽  
Mohamed Lamine Berkane

With the exponential growth of biological data, labeling this kind of data becomes difficult and costly. Although unlabeled data are comparatively more plentiful than labeled data, most supervised learning methods are not designed to use unlabeled data. Semi-supervised learning methods are motivated by the availability of large unlabeled datasets alongside only a small number of labeled examples. However, incorporating unlabeled data into learning does not guarantee an improvement in classification performance. This paper introduces an approach based on a semi-supervised learning model, namely self-training with a deep learning algorithm, to predict missing classes from labeled and unlabeled data. To assess the performance of the proposed approach, two datasets are used with four performance measures: precision, recall, F-measure, and area under the ROC curve (AUC).
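The self-training loop described here can be sketched as follows: fit a model on the labeled data, pseudo-label the unlabeled points it is confident about, add them to the training set, and repeat. To keep the example self-contained, a one-dimensional nearest-centroid classifier stands in for the paper's deep learning model; that substitution, the margin-based confidence rule and all names are assumptions.

```python
def centroids(points, labels):
    """Mean of the 1-D points in each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(x, cents):
    """Return the nearest class and how much closer it is than the runner-up."""
    ranked = sorted(cents.items(), key=lambda kv: abs(x - kv[1]))
    best_label, best_c = ranked[0]
    if len(ranked) == 1:
        return best_label, float("inf")
    return best_label, abs(x - ranked[1][1]) - abs(x - best_c)


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Iteratively absorb confidently pseudo-labeled points into the training set."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(xs, ys)
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(x, cents)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = [x for x, _ in rest]
    return centroids(xs, ys), pool


# Two labeled points, three unlabeled ones; 5.0 is ambiguous and stays unlabeled.
model, leftover = self_train([0.0, 10.0], ["a", "b"], [1.0, 9.0, 5.0])
```

The confidence threshold is what guards against the risk noted in the abstract: pseudo-labels that are wrong can degrade the model rather than improve it.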


2021 ◽  
Vol 54 (3-4) ◽  
pp. 439-445
Author(s):  
Chih-Ta Yen ◽  
Sheng-Nan Chang ◽  
Cheng-Hong Liao

This study used photoplethysmography signals to classify blood pressure status into no hypertension, prehypertension, stage I hypertension, and stage II hypertension. Four deep learning models are compared in the study. The difficulty lies in finding the optimal parameters, such as kernel, kernel size, and number of layers, when only limited photoplethysmography (PPG) training data are available. PPG signals were used to train a deep residual network convolutional neural network (ResNetCNN) and a bidirectional long short-term memory (BILSTM) network to determine the optimal operating parameters when each dataset consisted of 2100 data points. During the experiment, the ratio of training to testing data was 8:2. The model demonstrated an optimal classification accuracy of 76% on the testing dataset.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 603
Author(s):  
Bee Hock David Koh ◽  
Chin Leng Peter Lim ◽  
Hasnae Rahimi ◽  
Wai Lok Woo ◽  
Bin Gao

A neural network that matches with a complex data function is likely to boost the classification performance as it is able to learn the useful aspect of the highly varying data. In this work, the temporal context of the time series data is chosen as the useful aspect of the data that is passed through the network for learning. By exploiting the compositional locality of the time series data at each level of the network, shift-invariant features can be extracted layer by layer at different time scales. The temporal context is made available to the deeper layers of the network by a set of data processing operations based on the concatenation operation. A matching learning algorithm for the revised network is described in this paper. It uses gradient routing in the backpropagation path. The framework as proposed in this work attains better generalization without overfitting the network to the data, as the weights can be pretrained appropriately. It can be used end-to-end with multivariate time series data in their raw form, without the need for manual feature crafting or data transformation. Data experiments with electroencephalogram signals and human activity signals show that with the right amount of concatenation in the deeper layers of the proposed network, it can improve the performance in signal classification.


Author(s):  
Rafly Indra Kurnia ◽  
Abba Suganda Girsang

This study classifies text based on the ratings of a provider application on the Google Play Store. User comments are classified using Word2vec and a deep learning algorithm, in this case Long Short-Term Memory (LSTM), based on the rating given. Two rating scales are used: a scale of 1-5, where rating 1 is the lowest and rating 5 is the highest, and a scale of 1-3, where rating 1 (negative) combines ratings 1 and 2, rating 2 (neutral) is rating 3, and rating 3 (positive) combines ratings 4 and 5. SMOTE oversampling is used to handle the imbalanced data. The dataset contains 16369 records. The training and testing data are taken from user comments on the MyTelkomsel application on the play.google.com site, where each comment, written in Indonesian, has a rating. This review data will be very useful for companies making business decisions. Such data can also be obtained from social media, but social media does not provide a rating feature for every user comment. The goal of this research is that user satisfaction can also be estimated quickly from social media data, such as Twitter or Facebook comments, based on the rating predicted from each comment. The best F1 score and precision obtained using 5 classes with LSTM and SMOTE were 0.62 and 0.70, and the best F1 score and precision obtained using 3 classes with LSTM and SMOTE were 0.86 and 0.87.
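The collapse from the five-point rating scale to the three sentiment classes follows directly from the rule stated in the abstract. A minimal sketch, with the function name `rating_to_sentiment` and the label dictionary being hypothetical:

```python
SENTIMENT_NAMES = {1: "negative", 2: "neutral", 3: "positive"}


def rating_to_sentiment(rating):
    """Map a 1-5 star rating to the 3-class scale described in the abstract."""
    if rating in (1, 2):
        return 1  # negative: ratings 1 and 2
    if rating == 3:
        return 2  # neutral: rating 3
    if rating in (4, 5):
        return 3  # positive: ratings 4 and 5
    raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")


# Apply the mapping to every possible star rating.
classes = [rating_to_sentiment(r) for r in range(1, 6)]
```

Merging classes this way reduces the number of targets the LSTM must separate, which is consistent with the higher scores reported for the 3-class setting.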

