Estimating latent positions of actors using Neural Networks in R with GCN4R

Abstract Network analysis methods help researchers understand and contextualize relationships between entities. While statistical and machine learning prediction models generally assume independence between actors, network-based statistical methods for social network data allow for dyadic dependence between actors. Although numerous methods have been developed for the R statistical software to analyze such data, deep learning methods have not been implemented in this language. Here, we introduce GCN4R, an R library for fitting graph neural networks on independent networks; these models aggregate actor covariate information to yield meaningful embeddings for a variety of network-based tasks (e.g. community detection, peer effects models, social influence). We provide an extensive overview of the insights and methods the deep learning community uses for learning on social and biological networks, followed by a tutorial that demonstrates some of the capabilities of the GCN4R framework, with the aim of making these methods more accessible to the R research community.

Biological network analysis with deep learning

Briefings in Bioinformatics ◽ 10.1093/bib/bbaa257 ◽ 2020

Author(s): Giulia Muzio ◽ Leslie O’Bray ◽ Karsten Borgwardt

Keyword(s): Neural Networks ◽ Deep Learning ◽ Biological Networks ◽ Protein Function ◽ Regulatory Networks ◽ Biological Network ◽ Gene Interaction ◽ Disease Diagnosis ◽ Interaction Prediction ◽ Graph Neural Networks

Abstract Recent advancements in experimental high-throughput technologies have expanded the availability and quantity of molecular data in biology. Given the importance of interactions in biological processes, such as the interactions between proteins or the bonds within a chemical compound, this data is often represented in the form of a biological network. The rise of this data has created a need for new computational tools to analyze networks. One major trend in the field is to use deep learning for this goal and, more specifically, to use methods that work with networks, the so-called graph neural networks (GNNs). In this article, we describe biological networks and review the principles and underlying algorithms of GNNs. We then discuss domains in bioinformatics in which graph neural networks are frequently being applied at the moment, such as protein function prediction, protein–protein interaction prediction and in silico drug discovery and development. Finally, we highlight application areas such as gene regulatory networks and disease diagnosis where deep learning is emerging as a new tool to answer classic questions like gene interaction prediction and automatic disease prediction from data.

Latency Estimation Tool and Investigation of Neural Networks Inference on Mobile GPU

Computers ◽ 10.3390/computers10080104 ◽ 2021 ◽ Vol 10 (8) ◽ pp. 104

Author(s): Evgeny Ponomarev ◽ Sergey Matveev ◽ Ivan Oseledets ◽ Valery Glukhov

Keyword(s): Neural Network ◽ Experimental Data ◽ Neural Networks ◽ Deep Learning ◽ Network Inference ◽ Prediction Models ◽ Specific Problem ◽ Research Community ◽ Specific Task ◽ Mobile GPU

Many deep learning applications are intended to run on mobile devices, and for many of them both accuracy and inference time matter. While the number of FLOPs is usually used as a proxy for neural network latency, it may not be the best choice. To obtain a better approximation of latency, the research community uses lookup tables of all possible layers to calculate inference time on a mobile CPU, which requires only a small number of experiments. Unfortunately, on a mobile GPU this method is not applicable in a straightforward way and shows low precision. In this work, we consider latency approximation on a mobile GPU as a data- and hardware-specific problem. Our main goal is to construct a convenient Latency Estimation Tool for Investigation (LETI) of neural network inference and to build robust and accurate latency prediction models for each specific task. To achieve this goal, we build tools that provide a convenient way to conduct massive experiments on different target devices, focusing on a mobile GPU. After evaluation of the dataset, one can train a regression model on the experimental data and use it for future latency prediction and analysis. We experimentally demonstrate the applicability of this approach on a subset of the popular NAS-Benchmark 101 dataset for two different mobile GPUs.
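
The lookup-table approach mentioned in the abstract can be pictured as summing one pre-measured latency per layer. The following is only a minimal sketch of that idea, not the LETI tool: the table keys, the latency values and the function name `estimate_latency_ms` are all hypothetical and chosen for illustration.

```python
# Hypothetical per-layer latencies in milliseconds. In practice each entry
# would be measured once per (layer type, width) on the target device.
LATENCY_TABLE_MS = {
    ("conv3x3", 32): 1.8,
    ("conv3x3", 64): 3.1,
    ("maxpool", 32): 0.4,
    ("dense", 10): 0.2,
}


def estimate_latency_ms(layers, table=LATENCY_TABLE_MS):
    """Sum the looked-up latency of every layer.

    Raises KeyError if a layer configuration was never measured.
    """
    return sum(table[layer] for layer in layers)


# A made-up four-layer network described as (layer type, width) pairs.
network = [("conv3x3", 32), ("maxpool", 32), ("conv3x3", 64), ("dense", 10)]
estimated = estimate_latency_ms(network)
print(f"estimated latency: {estimated:.1f} ms")
```

The sketch treats total latency as a plain sum of per-layer entries. The abstract reports that this kind of estimate shows low precision on a mobile GPU, which is the motivation for training a regression model on measured data instead.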

Financial Information Asymmetry: Using Deep Learning Algorithms to Predict Financial Distress

Symmetry ◽ 10.3390/sym13030443 ◽ 2021 ◽ Vol 13 (3) ◽ pp. 443

Author(s): Chyan-long Jan

Keyword(s): Neural Networks ◽ Deep Learning ◽ Information Asymmetry ◽ Error Rate ◽ Financial Distress ◽ Prediction Models ◽ High Accuracy ◽ Financial Information ◽ Financial Distress Prediction ◽ Distress Prediction

Because of financial information asymmetry, stakeholders usually do not know a company’s real financial condition until financial distress occurs. Financial distress not only influences a company’s operational sustainability and damages the rights and interests of its stakeholders; it may also harm the national economy and society. Hence, it is very important to build high-accuracy financial distress prediction models. The purpose of this study is to build high-accuracy and effective financial distress prediction models using two representative deep learning algorithms: deep neural networks (DNN) and convolutional neural networks (CNN). In addition, important variables are selected by the chi-squared automatic interaction detector (CHAID). The data on Taiwan’s listed and OTC sample companies are taken from the Taiwan Economic Journal (TEJ) database for the period from 2000 to 2019 and include 86 companies in financial distress and 258 not in financial distress, for a total of 344 companies. According to the empirical results, with the important variables selected by CHAID and modeling by CNN, the CHAID-CNN model has the highest financial distress prediction accuracy rate, 94.23%, and the lowest type I and type II error rates, 0.96% and 4.81%, respectively.

Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽ 10.1038/s41598-021-83193-1 ◽ 2021 ◽ Vol 11 (1)

Author(s): Dipendra Jha ◽ Vishu Gupta ◽ Logan Ward ◽ Zijiang Yang ◽ Christopher Wolverton ◽ ...

Keyword(s): Neural Networks ◽ Big Data ◽ Deep Learning ◽ Deep Neural Networks ◽ Materials Science ◽ Prediction Models ◽ Model Performance ◽ Materials Informatics ◽ Learning Framework ◽ Significant Attention

Abstract The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to use deeper neural networks in a bid to boost model performance, but in practice doing so leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet), composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
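
Why a residual shortcut helps against vanishing gradients can be seen in a deliberately simplified toy case. The sketch below is not the IRNet architecture; it assumes scalar linear layers that all share one hypothetical weight, and the function name `chain_gradient` is made up for illustration.

```python
def chain_gradient(weight, depth, residual):
    """Derivative of the output with respect to the input for `depth`
    stacked scalar linear layers that all share the same weight."""
    grad = 1.0
    for _ in range(depth):
        # Plain layer:    y = w * x      so dy/dx = w
        # Residual layer: y = x + w * x  so dy/dx = 1 + w
        grad *= (1.0 + weight) if residual else weight
    return grad


# Assumed toy setting: a small weight and 20 layers.
plain_grad = chain_gradient(0.05, depth=20, residual=False)
residual_grad = chain_gradient(0.05, depth=20, residual=True)
print(f"plain: {plain_grad:.3e}, residual: {residual_grad:.3f}")
```

In the plain stack the gradient is a product of small factors and shrinks toward zero, while the identity term contributed by each shortcut keeps the product of order one. Real networks use nonlinear, vector-valued layers, so this only illustrates the mechanism.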

Beyond Deep Learning: An Econometric Example

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽ 10.1142/s0218488520400036 ◽ 2020 ◽ Vol 28 (Supp01) ◽ pp. 31-38 ◽ Cited By ~ 1

Author(s): Ruofan Liao ◽ Paravee Maneejuk ◽ Songsak Sriboonchitta

Keyword(s): Neural Networks ◽ Deep Learning ◽ Prediction Models ◽ Original Data ◽ Parametric Model ◽ Parametric Models ◽ The Past ◽ Currency Exchange ◽ Currency Exchange Rate ◽ Linear And Nonlinear

In the past, in many areas, the best prediction models were linear and nonlinear parametric models. In the last decade, in many application areas, deep learning has been shown to lead to more accurate predictions than the parametric models. Deep learning-based predictions are reasonably accurate, but not perfect. How can we achieve better accuracy? To achieve this objective, we propose to combine neural networks with a parametric model: namely, to train neural networks not on the original data, but on the differences between the actual data and the predictions of the parametric model. Using the example of predicting currency exchange rates, we show that this idea indeed leads to more accurate predictions.
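
The proposal in this abstract, training a network on the differences between the data and a parametric model's predictions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' setup: the data are synthetic rather than exchange rates, the parametric model is taken to be ordinary least squares, and the network size, learning rate and all variable names are hypothetical.

```python
import math
import random

random.seed(0)

# Synthetic data (an assumption): a linear trend plus a nonlinear term.
xs = [-1.0 + 2.0 * i / 39 for i in range(40)]
ys = [2.0 * x + 1.0 + math.sin(3.0 * x) for x in xs]
n = len(xs)

# Step 1: parametric model, here ordinary least squares for y = a * x + b.
mean_x, mean_y = sum(xs) / n, sum(ys) / n
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - a * mean_x
linear_pred = [a * x + b for x in xs]

# Step 2: residuals = actual data minus the parametric predictions.
residuals = [y - p for y, p in zip(ys, linear_pred)]

# Step 3: a one-hidden-layer tanh network fitted to the residuals
# with full-batch gradient descent on the mean squared error.
H, lr = 8, 0.05
w1 = [random.gauss(0.0, 2.0) for _ in range(H)]
b1 = [random.gauss(0.0, 1.0) for _ in range(H)]
w2 = [0.0] * H  # zero output weights: the network initially predicts 0
b2 = 0.0


def net(x):
    return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(H)) + b2


for _ in range(2000):
    g_w1, g_b1, g_w2, g_b2 = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
    for x, r in zip(xs, residuals):
        hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
        err = sum(w2[j] * hidden[j] for j in range(H)) + b2 - r
        g_b2 += err
        for j in range(H):
            g_w2[j] += err * hidden[j]
            back = err * w2[j] * (1.0 - hidden[j] ** 2)
            g_w1[j] += back * x
            g_b1[j] += back
    for j in range(H):
        w1[j] -= lr * g_w1[j] / n
        b1[j] -= lr * g_b1[j] / n
        w2[j] -= lr * g_w2[j] / n
    b2 -= lr * g_b2 / n


def mse(pred):
    return sum((y - p) ** 2 for y, p in zip(ys, pred)) / n


# Final prediction = parametric prediction + network's residual correction.
hybrid_pred = [p + net(x) for x, p in zip(xs, linear_pred)]
mse_linear, mse_hybrid = mse(linear_pred), mse(hybrid_pred)
print(f"linear MSE: {mse_linear:.4f}, hybrid MSE: {mse_hybrid:.4f}")
```

Both errors are measured on the training points, so the comparison only shows that the network absorbs structure the linear model missed. A real evaluation would use held-out data.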

Abstract P437: Deep Learning-based Models For Complete Atrioventricular Block Heart Rhythm Analysis

Circulation Research ◽ 10.1161/res.129.suppl_1.p437 ◽ 2021 ◽ Vol 129 (Suppl_1)

Author(s): Dahim Choi ◽ Nam Kyun Kim ◽ Young H Son ◽ Yuming Gao ◽ Christina Sheng ◽ ...

Keyword(s): Neural Networks ◽ Deep Learning ◽ Atrioventricular Block ◽ Prediction Models ◽ Pacemaker Implantation ◽ Heart Rhythm ◽ High Sensitivity ◽ Ventricular Myocardium ◽ True Positive Rate ◽ Starting Point

Atrioventricular block (AVB), caused by impairment in the heart conduction system, presents extreme diversity and is associated with other complications. Only half of AVB patients require a permanent pacemaker, and the process of deciding on pacemaker implantation is associated with an increase in cost and in patient morbidity and mortality. Thus, there is a need for models capable of accurately identifying transient or reversible causes of conduction disturbances and predicting patient risks and the necessity of a pacemaker. Deep learning (DL) has come to the forefront because of its prediction accuracy, and DL-based electrocardiogram (ECG) analysis can be a breakthrough for analyzing massive amounts of data. However, current DL models are unsuitable for AVB-ECG, where the P waves are decoupled from the QRS/T waves, and the black-box nature of DL-based models lowers the credibility of prediction models among physicians. Here, we present a real-time-capable DL-based algorithm that can identify AVB-ECG waves and automate AVB phenotyping for arrhythmogenic risk assessment. Our algorithm can analyze unformatted ECG records with abnormal patterns by integrating two representative DL algorithms: convolutional neural networks (CNN) and recurrent neural networks (RNN). This hybrid CNN/RNN network can memorize local patterns, spatial hierarchies, and long-range temporal dependencies of ECG signals. Furthermore, by integrating parameters derived from dimension reduction analysis and heart rate variability into the hybrid layers, the algorithm can capture the P/QRS/T-specific morphological and temporal features in ECG waveforms. We evaluated the algorithm using six AVB porcine models, where TBX18, a pacemaker transcription factor, was transduced into the ventricular myocardium to form a biological pacemaker, and an additional electronic pacemaker was transplanted as a backup pacemaker. We achieved high sensitivity (95% true positive rate) and quantified the potential risks of various pathological ECG patterns. This study may be a starting point for conducting both retrospective and prospective patient studies and will help physicians understand the algorithm's decision-making workflow and find incorrect recommendations for AVB patients.

Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models

Journal of Cheminformatics ◽ 10.1186/s13321-020-00479-8 ◽ 2021 ◽ Vol 13 (1)

Author(s): Dejun Jiang ◽ Zhenxing Wu ◽ Chang-Yu Hsieh ◽ Guangyong Chen ◽ Ben Liao ◽ ...

Keyword(s): Neural Networks ◽ Computational Efficiency ◽ Domain Knowledge ◽ Prediction Models ◽ Computational Cost ◽ Large Dataset ◽ Predictive Capacity ◽ Classification Tasks ◽ Graph Neural Networks ◽ Public Datasets

Abstract Graph neural networks (GNNs) have been considered an attractive modelling method for molecular property prediction, and numerous studies have shown that GNNs could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and need only a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored the use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models can still be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.

Learners Demographics Classification on MOOCs During the COVID-19: Author Profiling via Deep Learning Based on Semantic and Syntactic Representations

Frontiers in Research Metrics and Analytics ◽ 10.3389/frma.2021.673928 ◽ 2021 ◽ Vol 6

Author(s): Tahani Aljohani ◽ Alexandra I. Cristea

Keyword(s): Neural Network ◽ Neural Networks ◽ Deep Learning ◽ Prediction Models ◽ Short Term Memory ◽ Methodological Approach ◽ High Accuracy ◽ Directional Model ◽ Textual Representations ◽ The One

Massive Open Online Courses (MOOCs) have become universal learning resources, and the COVID-19 pandemic is rendering these platforms even more necessary. In this paper, we seek to improve Learner Profiling (LP), i.e. estimating the demographic characteristics of learners in MOOC platforms. We have focused on examining models which show promise elsewhere but had never been examined in the LP area (deep learning models), based on effective textual representations. As a first LP characteristic, we predict the employment status of learners. We compare sequential and parallel ensemble deep learning architectures based on Convolutional Neural Networks and Recurrent Neural Networks, obtaining a high average accuracy of 96.3% for our best method. Next, we predict the gender of learners based on syntactic knowledge from the text. We compare different tree-structured Long-Short-Term Memory models (as state-of-the-art candidates) and provide our novel version of a Bi-directional composition function for existing architectures. In addition, we evaluate 18 different combinations of word-level encoding and sentence-level encoding functions. Based on these results, we show that our Bi-directional model outperforms all other models and that the highest accuracy among our models is achieved by the combination of the FeedForward Neural Network and the Stack-augmented Parser-Interpreter Neural Network (82.60% prediction accuracy). We argue that the prediction models we recommend for both demographic characteristics examined in this study can achieve high accuracy. This is also the first time a sound methodological approach toward improving accuracy for learner demographics classification on MOOCs has been proposed.

Deep Learning Strategies for ProtoDUNE Raw Data Denoising

Computing and Software for Big Science ◽ 10.1007/s41781-021-00077-9 ◽ 2022 ◽ Vol 6 (1)

Author(s): Marco Rossi ◽ Sofia Vallecorsa

Keyword(s): Neural Networks ◽ Deep Learning ◽ Learning Strategies ◽ Simulation Data ◽ Raw Data ◽ Digital Detector ◽ Speed Up ◽ Neural Network Hardware ◽ Graph Neural Networks ◽ High Level

Abstract In this work, we investigate different machine learning-based strategies for denoising raw simulation data from the ProtoDUNE experiment. The ProtoDUNE detector is hosted by CERN and it aims to test and calibrate the technologies for DUNE, a forthcoming experiment in neutrino physics. The reconstruction workchain consists of converting digital detector signals into physical high-level quantities. We address the first step in reconstruction, namely raw data denoising, leveraging deep learning algorithms. We design two architectures based on graph neural networks, aiming to enhance the receptive field of basic convolutional neural networks. We benchmark this approach against traditional algorithms implemented by the DUNE collaboration. We test the capabilities of graph neural network hardware accelerator setups to speed up training and inference processes.

A Graph Feature Auto-Encoder for the prediction of unobserved node features on biological networks

BMC Bioinformatics ◽ 10.1186/s12859-021-04447-3 ◽ 2021 ◽ Vol 22 (1)

Author(s): Ramin Hasibi ◽ Tom Michoel

Keyword(s): Gene Expression ◽ Neural Networks ◽ Gene Expression Data ◽ Biological Networks ◽ Molecular Interaction ◽ Interaction Networks ◽ Omics Data ◽ Expression Data ◽ Molecular Interaction Networks ◽ Graph Neural Networks

Abstract Background: Molecular interaction networks summarize complex biological processes as graphs, whose structure is informative of biological function at multiple scales. Simultaneously, omics technologies measure the variation or activity of genes, proteins, or metabolites across individuals or experimental conditions. Integrating the complementary viewpoints of biological networks and omics data is an important task in bioinformatics, but existing methods treat networks as discrete structures, which are intrinsically difficult to integrate with continuous node features or activity measures. Graph neural networks map graph nodes into a low-dimensional vector space representation, and can be trained to preserve both the local graph structure and the similarity between node features.
Results: We studied the representation of transcriptional, protein–protein and genetic interaction networks in E. coli and mouse using graph neural networks. We found that such representations explain a large proportion of variation in gene expression data, and that using gene expression data as node features improves the reconstruction of the graph from the embedding. We further proposed a new end-to-end Graph Feature Auto-Encoder framework for the prediction of node features utilizing the structure of the gene networks, which is trained on the feature prediction task, and showed that it performs better at predicting unobserved node features than regular MultiLayer Perceptrons. When applied to the problem of imputing missing data in single-cell RNAseq data, the Graph Feature Auto-Encoder utilizing our new graph convolution layer called FeatGraphConv outperformed a state-of-the-art imputation method that does not use protein interaction information, showing the benefit of integrating biological networks and omics data with our proposed approach.
Conclusion: Our proposed Graph Feature Auto-Encoder framework is a powerful approach for integrating and exploiting the close relation between molecular interaction networks and functional genomics data.
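
How a graph neural network maps nodes into a low-dimensional vector space can be illustrated with a single graph convolution step. The sketch below uses the widely known symmetric-normalization rule ReLU(D^-1/2 (A + I) D^-1/2 X W); it is not the FeatGraphConv layer from this paper, and the three-node graph, the one-hot features and the weight matrix are hypothetical values chosen so the result can be checked by hand.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    n_in, n_out = len(weights), len(weights[0])
    # Add self-loops so each node keeps a share of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by the degrees of both endpoints.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


# Hypothetical path graph 0 - 1 - 2 with one-hot node features.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
X = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]  # maps 3 input features to a 2-dimensional embedding

embeddings = gcn_layer(A, X, W)
for node, vector in enumerate(embeddings):
    print(node, [round(v, 3) for v in vector])
```

Each output row is a two-dimensional embedding that mixes a node's own features with those of its neighbours. In a trained model the weight matrix would be learned rather than fixed by hand.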
