scholarly journals Graph Neural Architecture Search

Author(s):  
Yang Gao ◽  
Hong Yang ◽  
Peng Zhang ◽  
Chuan Zhou ◽  
Yue Hu

Graph neural networks (GNNs) emerged recently as a powerful tool for analyzing non-Euclidean data such as social network data. Despite their success, the design of graph neural networks requires heavy manual work and domain knowledge. In this paper, we present a graph neural architecture search method (GraphNAS) that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and trains the recurrent network with policy gradient to maximize the expected accuracy of the generated architectures on a validation data set. Furthermore, to improve the search efficiency of GraphNAS on big networks, GraphNAS restricts the search space from an entire architecture space to a sequential concatenation of the best search results built on each single architecture layer. Experiments on real-world datasets demonstrate that GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of validation set accuracy. Moreover, in a transfer learning task we observe that graph neural architectures designed by GraphNAS, when transferred to new datasets, still gain improvement in terms of prediction accuracy.

2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN , a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract the features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric over a large-scale challenging SPG dataset) and has magnitudes fewer parameters than both image-based and sequence-based methods.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Dejun Jiang ◽  
Zhenxing Wu ◽  
Chang-Yu Hsieh ◽  
Guangyong Chen ◽  
Ben Liao ◽  
...  

AbstractGraph neural networks (GNN) has been considered as an attractive modelling method for molecular property prediction, and numerous studies have shown that GNN could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and only need a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models still can be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.


2020 ◽  
Author(s):  
Douglas Meneghetti ◽  
Reinaldo Bianchi

This work proposes a neural network architecture that learns policies for multiple agent classes in a heterogeneous multi-agent reinforcement setting. The proposed network uses directed labeled graph representations for states, encodes feature vectors of different sizes for different entity classes, uses relational graph convolution layers to model different communication channels between entity types and learns distinct policies for different agent classes, sharing parameters wherever possible. Results have shown that specializing the communication channels between entity classes is a promising step to achieve higher performance in environments composed of heterogeneous entities.


Healthcare ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 181 ◽  
Author(s):  
Patricia Melin ◽  
Julio Cesar Monica ◽  
Daniela Sanchez ◽  
Oscar Castillo

In this paper, a multiple ensemble neural network model with fuzzy response aggregation for the COVID-19 time series is presented. Ensemble neural networks are composed of a set of modules, which are used to produce several predictions under different conditions. The modules are simple neural networks. Fuzzy logic is then used to aggregate the responses of several predictor modules, in this way, improving the final prediction by combining the outputs of the modules in an intelligent way. Fuzzy logic handles the uncertainty in the process of making a final decision about the prediction. The complete model was tested for the case of predicting the COVID-19 time series in Mexico, at the level of the states and the whole country. The simulation results of the multiple ensemble neural network models with fuzzy response integration show very good predicted values in the validation data set. In fact, the prediction errors of the multiple ensemble neural networks are significantly lower than using traditional monolithic neural networks, in this way showing the advantages of the proposed approach.


2021 ◽  
Vol 2 (4) ◽  
pp. 1-8
Author(s):  
Martin Happ ◽  
Matthias Herlich ◽  
Christian Maier ◽  
Jia Lei Du ◽  
Peter Dorfinger

Modeling communication networks to predict performance such as delay and jitter is important for evaluating and optimizing them. In recent years, neural networks have been used to do this, which may have advantages over existing models, for example from queueing theory. One of these neural networks is RouteNet, which is based on graph neural networks. However, it is based on simplified assumptions. One key simplification is the restriction to a single scheduling policy, which describes how packets of different flows are prioritized for transmission. In this paper we propose a solution that supports multiple scheduling policies (Strict Priority, Deficit Round Robin, Weighted Fair Queueing) and can handle mixed scheduling policies in a single communication network. Our solution is based on the RouteNet architecture as part of the "Graph Neural Network Challenge". We achieved a mean absolute percentage error under 1% with our extended model on the evaluation data set from the challenge. This takes neural-network-based delay estimation one step closer to practical use.


2020 ◽  
Vol 2 (2) ◽  
pp. 32-37
Author(s):  
P. RADIUK ◽  

Over the last decade, a set of machine learning algorithms called deep learning has led to significant improvements in computer vision, natural language recognition and processing. This has led to the widespread use of a variety of commercial, learning-based products in various fields of human activity. Despite this success, the use of deep neural networks remains a black box. Today, the process of setting hyperparameters and designing a network architecture requires experience and a lot of trial and error and is based more on chance than on a scientific approach. At the same time, the task of simplifying deep learning is extremely urgent. To date, no simple ways have been invented to establish the optimal values of learning hyperparameters, namely learning speed, sample size, data set, learning pulse, and weight loss. Grid search and random search of hyperparameter space are extremely resource intensive. The choice of hyperparameters is critical for the training time and the final result. In addition, experts often choose one of the standard architectures (for example, ResNets and ready-made sets of hyperparameters. However, such kits are usually suboptimal for specific practical tasks. The presented work offers an approach to finding the optimal set of hyperparameters of learning ZNM. An integrated approach to all hyperparameters is valuable because there is an interdependence between them. The aim of the work is to develop an approach for setting a set of hyperparameters, which will reduce the time spent during the design of ZNM and ensure the efficiency of its work. In recent decades, the introduction of deep learning methods, in particular convolutional neural networks (CNNs), has led to impressive success in image and video processing. However, the training of CNN has been commonly mostly based on the employment of quasi-optimal hyperparameters. Such an approach usually requires huge computational and time costs to train the network and does not guarantee a satisfactory result. However, hyperparameters play a crucial role in the effectiveness of CNN, as diverse hyperparameters lead to models with significantly different characteristics. Poorly selected hyperparameters generally lead to low model performance. The issue of choosing optimal hyperparameters for CNN has not been resolved yet. The presented work proposes several practical approaches to setting hyperparameters, which allows reducing training time and increasing the accuracy of the model. The article considers the function of training validation loss during underfitting and overfitting. There are guidelines in the end to reach the optimization point. The paper also considers the regulation of learning rate and momentum to accelerate network training. All experiments are based on the widespread CIFAR-10 and CIFAR-100 datasets.


Author(s):  
Jing Huang ◽  
Jie Yang

Hypergraph, an expressive structure with flexibility to model the higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt the powerful GNN-variants directly into hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models into hypergraphs. In this framework, meticulously-designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, which outperform the state-of-the-art approaches with a large margin. Especially for the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.


Information ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 388
Author(s):  
Baocheng Wang ◽  
Wentao Cai

With the rapid increase in the popularity of big data and internet technology, sequential recommendation has become an important method to help people find items they are potentially interested in. Traditional recommendation methods use only recurrent neural networks (RNNs) to process sequential data. Although effective, the results may be unable to capture both the semantic-based preference and the complex transitions between items adequately. In this paper, we model separated session sequences into session graphs and capture complex transitions using graph neural networks (GNNs). We further link items in interaction sequences with existing external knowledge base (KB) entities and integrate the GNN-based recommender with key-value memory networks (KV-MNs) to incorporate KB knowledge. Specifically, we set a key matrix to many relation embeddings that learned from KB, corresponding to many entity attributes, and set up a set of value matrices storing the semantic-based preferences of different users for the corresponding attribute. By using a hybrid of a GNN and KV-MN, each session is represented as the combination of the current interest (i.e., sequential preference) and the global preference (i.e., semantic-based preference) of that session. Extensive experiments on three public real-world datasets show that our method performs better than baseline algorithms consistently.


2020 ◽  
Vol 34 (05) ◽  
pp. 9242-9249
Author(s):  
Yujing Wang ◽  
Yaming Yang ◽  
Yiren Chen ◽  
Jing Bai ◽  
Ce Zhang ◽  
...  

Learning text representation is crucial for text classification and other language related tasks. There are a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most of the existing works of NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior to the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.


2005 ◽  
Vol 7 (4) ◽  
pp. 291-296 ◽  
Author(s):  
P. Hettiarachchi ◽  
M. J. Hall ◽  
A. W. Minns

The last decade has seen increasing interest in the application of Artificial Neural Networks (ANNs) for the modelling of the relationship between rainfall and streamflow. Since multi-layer, feed-forward ANNs have the property of being universal approximators, they are able to capture the essence of most input–output relationships, provided that an underlying deterministic relationship exists. Unfortunately, owing to the standardisation of inputs and outputs that is required to run ANNs, a problem arises in extrapolation: if the training data set does not contain the maximum possible output value, an unmodified network will be unable to synthesise this peak value. The occurrence of high magnitude, low frequency events within short periods of record is largely fortuitous. Therefore, the confidence in the neural network model can be greatly enhanced if some methodology can be found for incorporating domain knowledge about such events into the calibration and verification procedure in addition to the available measured data sets. One possible form of additional domain knowledge is the Estimated Maximum Flood (EMF), a notional event with a small but non-negligible probability of exceedence. This study investigates the suitability of including an EMF estimate in the training set of a rainfall–runoff ANN in order to improve the extrapolation characteristics of the network. A study has been carried out in which EMFs have been included, along with recorded flood events, in the training of ANN models for six catchments in the south west of England. The results demonstrate that, with prior transformation of the runoff data to logarithms of flows, the inclusion of domain knowledge in the form of such extreme synthetic events improves the generalisation capabilities of the ANN model and does not disrupt the training process. Where guidelines are available for EMF estimation, the application of this approach is recommended as an alternative means of overcoming the inherent extrapolation problems of multi-layer, feed-forward ANNs.


Sign in / Sign up

Export Citation Format

Share Document