graph mining
Recently Published Documents


TOTAL DOCUMENTS

469
(FIVE YEARS 120)

H-INDEX

25
(FIVE YEARS 5)

2022 ◽  
Vol 13 (1) ◽  
pp. 1-28
Author(s):  
Mohammad Ehsan Shahmi Chowdhury ◽  
Chowdhury Farhan Ahmed ◽  
Carson K. Leung

Graph datasets now have a vast number of applications. As a result, graph mining (mining graph datasets to extract frequent subgraphs) has proven crucial in numerous respects. Correlation analysis among the subparts (i.e., elements) of the frequent subgraphs generated by graph mining is important for uncovering interesting information. However, the majority of existing work focuses on the complexities of dealing with graph structures, and little work aims to perform correlation analysis. For instance, one previous work in this area used a very naive approach and dealt with only a small subset of the problem. Hence, in this article, a new measure is proposed to aid the analysis of large subgraphs mined from various types of graph transactions in the dataset. These subgraphs are immense in terms of their structural composition, and thus parallel the entire set of graphs in the real world. A complete framework for discovering the relations among parts of a frequent subgraph is proposed using our new method. Evaluation results show the usefulness and accuracy of the newly defined measure on real-life graph datasets.


2022 ◽  
Author(s):  
Md Mostafizur Rahman ◽  
Srinivas Mukund Vadrev ◽  
Arturo Magana-Mora ◽  
Jacob Levman ◽  
Othman Soufan

Food-drug interactions (FDIs) arise when dietary consumption regulates biochemical mechanisms involved in drug metabolism. To characterize the nature of food's influence on pharmacological treatment, it is essential to detect all possible FDIs. In this study, we propose FDMine, a novel systematic framework that models the FDI problem as a homogeneous graph. In this graph, all nodes representing drugs, foods, and food compositions are represented as chemical structures. This homogeneous representation enables us to take advantage of reported drug-drug interactions for accuracy evaluation, especially when accessible ground truth for FDIs is lacking. Our dataset consists of 788 unique approved small-molecule drugs with metabolism-related drug-drug interactions (DDIs) and 320 unique food items, composed of 563 unique compounds with 179 health effects. The potential number of interactions is 87,192 and 92,143 when two different versions of the graph, referred to as the disjoint and joint graphs, are considered, respectively. We defined several similarity subnetworks comprising food-drug similarity (FDS), drug-drug similarity (DDS), and food-food similarity (FFS) networks, based on similarity profiles. A unique part of the graph is the encoding of the food composition as a set of nodes and the calculation of a content contribution score to re-weight the similarity links. To predict new FDI links, we applied path category-based (path lengths 2 and 3) and neighborhood-based similarity link prediction algorithms. We calculated the precision@top (top 1%, 2%, and 5%) of the newly predicted links, the area under the receiver operating characteristic curve, and the precision-recall curve. We performed three types of evaluations to benchmark results using different types of interactions. The shortest path-based method achieved a precision of 84%, 60%, and 40% for the top 1%, 2%, and 5% of FDIs identified, respectively. We validated the top FDIs predicted using FDMine to demonstrate its applicability, and we relate the therapeutic anti-inflammatory effects of food items informed by FDIs. We hypothesize that the proposed framework can be used to gain new insights on FDIs. FDMine is publicly available to support clinicians and researchers.
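To make the neighborhood-based link prediction and the precision@top metric concrete, here is a minimal sketch. It uses Jaccard similarity of neighbor sets as a stand-in scorer; the abstract does not say which neighborhood measure FDMine uses, so the scorer, the function names, and the toy node names are all assumptions for illustration.

```python
from itertools import combinations


def jaccard_scores(adj):
    """Score every non-adjacent node pair by Jaccard similarity of neighbor sets."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        if union:
            scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return scores


def precision_at_top(scores, true_links, fraction):
    """Precision among the top `fraction` of ranked candidate pairs."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    hits = sum(1 for pair in ranked[:k] if pair in true_links)
    return hits / k


# Toy graph with made-up node names (not taken from the FDMine dataset).
adj = {
    "drug_a": {"c1", "c2"},
    "food_x": {"c1", "c2"},
    "food_y": {"c2", "c3"},
    "c1": {"drug_a", "food_x"},
    "c2": {"drug_a", "food_x", "food_y"},
    "c3": {"food_y"},
}
scores = jaccard_scores(adj)
held_out = {("drug_a", "food_x")}  # hypothetical ground-truth link
top_precision = precision_at_top(scores, held_out, 0.1)
```

In this toy graph, drug_a and food_x share both of their compound neighbors, so that pair ranks first among the candidates.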


Author(s):  
Mark Whiting ◽  
Joseph Mettenburg ◽  
Enrico Novelli ◽  
Philip LeDuc ◽  
Jonathan Cagan

As machine learning is used to make strides in medical diagnostics, few methods provide heuristics from which human doctors can learn directly. This work introduces a method for leveraging human observable structures, such as macro scale vascular formations, for producing assessments of medical conditions with relatively few training cases, and uncovering patterns that are potential diagnostic aids. The approach draws on shape grammars, a rule-based technique, pioneered in design and architecture, and accelerated through a recursive sub-graph mining algorithm. The distribution of rule instances in the data from which they are induced is then used as an intermediary representation enabling common classification and anomaly detection approaches to identify indicative rules with relatively small data sets. The method is applied to 7 Tesla time-of-flight (TOF) angiography MRI (n = 54) of human brain vasculature. The data were segmented and induced to generate representative grammar rules. Ensembles of rules were isolated to implicate vascular conditions reliably. This application demonstrates the power of automated structured intermediary representations for assessing nuanced biological form relationships, and the strength of shape grammars, in particular for identifying indicative patterns in complex vascular networks.


Author(s):  
Georgios Drakopoulos ◽  
Eleanna Kafeza ◽  
Phivos Mylonas ◽  
Spyros Sioutas

2021 ◽  
Author(s):  
Qiang Zhu ◽  
Qinghui Dai ◽  
Bangchao Wang ◽  
Jinxing Liang ◽  
Junping Liu ◽  
...  

Risks ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 224
Author(s):  
Yeftanus Antonio ◽  
Sapto Wahyu Indratno ◽  
Rinovia Simanjuntak

Cyber insurance ratemaking (CIRM) is a procedure used to set rates (or prices) for cyber insurance products provided by insurance companies. Rate estimation is a critical issue for cyber insurance products. This problem arises because of the unavailability of actuarial data and the uncertainty of normative standards of cyber risk. Most cyber risk analyses do not consider the connections between Information and Communication Technology (ICT) sources. Recently, a cyber risk model was developed that considered the network structure. However, the analysis of this model remains limited to unweighted networks. To address this issue, we propose applying a graph mining approach (GMA) to CIRM, which can be used to obtain fair and competitive prices based on weighted network characteristics. This study differs from previous studies in that it adds the GMA to CIRM and uses communication models to represent the frequency of communications as weights in the network. We used the heterogeneous generalized susceptible-infectious-susceptible model to accommodate different infection rates. Our approach extends the existing method because it considers the communication frequency and GMA in CIRM. This approach results in heterogeneous premiums. Additionally, GMA can select the more active communications so that their high contribution is reflected in the premiums or rates. This contribution is not found when the infection rates are the same. Our experimental results indicate that this method can produce more reasonable and competitive prices than other methods. The prices obtained with GMA and communication factors are lower than those obtained without GMA and communication factors.
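As a rough illustration of how a susceptible-infectious-susceptible model with node-specific rates behaves on a weighted network, the sketch below integrates a standard mean-field approximation with Euler steps. The update rule, the function name `simulate_sis`, and the toy weights are assumptions chosen for illustration; the abstract does not specify the authors' exact model or how infection probabilities map to premiums.

```python
def simulate_sis(weights, infection_rates, recovery_rates, p0, dt=0.01, steps=5000):
    """Euler integration of the assumed mean-field dynamics
    dp_i/dt = -delta_i * p_i + (1 - p_i) * beta_i * sum_j w_ij * p_j.
    """
    p = list(p0)
    n = len(p)
    for _ in range(steps):
        new_p = []
        for i in range(n):
            # Weighted infection pressure from neighbors (w_ij = communication frequency).
            pressure = sum(weights[i][j] * p[j] for j in range(n))
            dp = -recovery_rates[i] * p[i] + (1.0 - p[i]) * infection_rates[i] * pressure
            # Clamp to keep each value a valid probability.
            new_p.append(min(1.0, max(0.0, p[i] + dt * dp)))
        p = new_p
    return p


# Toy network: node 1 communicates with node 0 three times as often as node 2 does.
weights = [
    [0.0, 3.0, 1.0],
    [3.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
steady = simulate_sis(weights, [0.5, 0.5, 0.5], [1.0, 1.0, 1.0], [0.1, 0.1, 0.1])
```

Under these assumed dynamics, the node with heavier communication ends up with a higher long-run infection probability, which is the kind of heterogeneity a weighted analysis can expose and an unweighted one cannot.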


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Chuanting Zhang ◽  
Ke-Ke Shang ◽  
Jingping Qiao

Link prediction is a fundamental problem of data science, which usually calls for unfolding the mechanisms that govern the micro-dynamics of networks. In this regard, using features obtained from network embedding to predict links has drawn widespread attention. Although methods based on edge features or node similarity have been proposed to solve the link prediction problem, many technical challenges still exist due to the unique structural properties of networks, especially when the networks are sparse. From the graph mining perspective, we first give empirical evidence of the inconsistency between heuristic and learned edge features. Then, we propose a novel link prediction framework, AdaSim, by introducing an Adaptive Similarity function using features obtained from network embedding based on random walks. The node feature representations are obtained by optimizing a graph-based objective function. Instead of generating edge features using binary operators, we perform link prediction leveraging solely the node features of the network. We define a flexible similarity function with one tunable parameter, which serves as a penalty on the original similarity measure. The optimal value is learned through supervised learning and is thus adaptive to the data distribution. To evaluate the performance of our proposed algorithm, we conduct extensive experiments on eleven disparate real-world networks. Experimental results show that AdaSim achieves better performance than state-of-the-art algorithms and is robust to different sparsities of the networks.
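The idea of a similarity function with one tunable penalty parameter, fitted with supervision, can be sketched as follows. The abstract does not give AdaSim's functional form, so the form used here (a dot product divided by the norm product raised to a power `p`) is purely a hypothetical example, as are the function names and the toy embeddings.

```python
import math


def adaptive_similarity(u, v, p):
    """Dot product penalized by the norm product raised to a tunable power p.

    p = 0 gives the raw dot product and p = 1 gives cosine similarity.
    This functional form is an assumption, not taken from the abstract.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm ** p) if norm > 0 else 0.0


def auc(pos_scores, neg_scores):
    """Probability that a random positive pair outscores a random negative one."""
    wins = sum((s > t) + 0.5 * (s == t) for s in pos_scores for t in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def fit_penalty(emb, pos_pairs, neg_pairs, grid):
    """Pick the penalty p from `grid` that maximizes AUC on labeled pairs."""
    def score(p):
        pos = [adaptive_similarity(emb[a], emb[b], p) for a, b in pos_pairs]
        neg = [adaptive_similarity(emb[a], emb[b], p) for a, b in neg_pairs]
        return auc(pos, neg)
    return max(grid, key=score)


# Made-up two-dimensional node embeddings; "c" has a large norm but a different direction.
emb = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [5.0, 5.0]}
best_p = fit_penalty(emb, [("a", "b")], [("a", "c")], [0.0, 0.5, 1.0])
```

On this toy data the raw dot product favors the large-norm node "c", so the grid search settles on the strongest penalty. A grid search is only one simple way to learn the parameter from labeled pairs.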


2021 ◽  
Author(s):  
Jian Kang ◽  
Hanghang Tong

2021 ◽  
pp. 1-11
Author(s):  
Kekun Hu ◽  
Gang Dong ◽  
Yaqian Zhao ◽  
Rengang Li ◽  
Dongdong Jiang ◽  
...  

Vertex classification is an important graph mining technique with applications in fields such as social recommendation and e-commerce recommendation. Existing classification methods fail to make full use of the graph topology to improve classification performance. To address this, we propose a Dual Graph Wavelet neural Network composed of two identical graph wavelet neural networks sharing network parameters. These two networks are integrated with a semi-supervised loss function and carry out supervised learning and unsupervised learning, respectively, on two matrices representing the graph topology extracted from the same graph dataset. One matrix embeds the local consistency information and the other the global consistency information. To reduce the computational complexity of the convolution operation of the graph wavelet neural network, we design an approximation scheme based on Chebyshev polynomials of the first kind. Experimental results show that the proposed network significantly outperforms the state-of-the-art approaches for vertex classification on all three benchmark datasets, and that the proposed approximation scheme is valid for datasets with a low average vertex degree when the approximation order is small.
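The role of Chebyshev polynomials of the first kind in such an approximation can be illustrated on a scalar kernel. The sketch below computes truncated Chebyshev coefficients and evaluates the series with the recurrence T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x). The heat-kernel-style filter, the order, and the function names are assumptions for illustration; the abstract does not describe the authors' actual scheme.

```python
import math


def chebyshev_coefficients(f, order, nodes=64):
    """Coefficients c_0..c_order of f on [-1, 1] via Chebyshev-Gauss quadrature."""
    thetas = [math.pi * (j + 0.5) / nodes for j in range(nodes)]
    return [
        2.0 / nodes * sum(f(math.cos(t)) * math.cos(k * t) for t in thetas)
        for k in range(order + 1)
    ]


def chebyshev_eval(coeffs, x):
    """Evaluate c_0/2 + sum_k c_k T_k(x) using T_k = 2x T_{k-1} - T_{k-2}."""
    t_prev, t_curr = 1.0, x  # T_0(x) and T_1(x)
    total = coeffs[0] / 2.0
    if len(coeffs) > 1:
        total += coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        total += c * t_curr
    return total


def kernel(x):
    """Hypothetical heat-kernel-style filter on a rescaled eigenvalue x in [-1, 1]."""
    return math.exp(-(x + 1.0))


coeffs = chebyshev_coefficients(kernel, order=6)
max_error = max(
    abs(chebyshev_eval(coeffs, x / 50.0) - kernel(x / 50.0)) for x in range(-50, 51)
)
```

In a graph setting the scalar x would be replaced by a Laplacian rescaled so its spectrum lies in [-1, 1], so the same recurrence needs only repeated sparse matrix-vector products rather than an eigendecomposition.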

