Large scale graph mining for web reputation inference

Abstract. Motivation: PIWI proteins and Piwi-interacting RNAs (piRNAs) are commonly detected in human cancers, especially in germline and somatic tissues, and correlate with poorer clinical outcomes, suggesting that they play a functional role in cancer. As the combinatorial explosion of possible ncRNA-disease associations becomes increasingly apparent, new bioinformatics methods for large-scale identification and prioritization of potential associations are of interest. However, in the real world, the network of interactions between molecules is enormously intricate and noisy, which poses a problem for efficient graph mining. This study makes a preliminary attempt at bionetwork-based graph mining. Results: In this study, we present a method based on a graph attention network, called GAPDA, to identify potential and biologically significant piRNA-disease associations (PDAs). The attention mechanism computes a hidden representation of an association in the network from its neighbor nodes and assigns weights to the inputs to make decisions. In particular, we introduce attention-based graph neural networks to the field of bio-association prediction for the first time and propose an abstract network topology suitable for small samples. Specifically, we combine piRNA sequence information and disease semantic similarity with the piRNA-disease association network to construct a new attribute network. In the experiment, GAPDA achieved an AUC of 0.9038 in five-fold cross-validation. It also outperformed methods based on collaborative filtering and attribute features. The experimental results show that GAPDA supports the promise of graph neural networks for such problems and can be an excellent supplement for future biomedical research. Supplementary information: Supplementary data are available at Bioinformatics online.
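
The attention step mentioned in the abstract above (a hidden representation computed from neighbor nodes, with learned weights on the inputs) can be illustrated with a minimal sketch. It assumes the commonly used single-head graph attention formulation: score each neighbor with LeakyReLU(a · [W h_i || W h_j]), normalize the scores with a softmax, and take the weighted sum of the transformed neighbor features. It is not GAPDA's actual architecture, and the names `attention_layer`, `W`, `a`, and the toy inputs are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU non-linearity applied to a raw attention score."""
    return x if x > 0 else slope * x


def attention_layer(features, neighbors, W, a):
    """One single-head attention pass over a graph (illustrative sketch only).

    features:  list of input vectors, one per node
    neighbors: neighbors[i] lists the nodes that node i attends to (include i itself)
    W:         weight matrix as a list of rows (out_dim x in_dim), assumed learned elsewhere
    a:         attention vector of length 2 * out_dim, assumed learned elsewhere
    Returns (hidden representations, attention weights per node).
    """
    # Shared linear transform z_i = W h_i.
    z = [[sum(w * x for w, x in zip(row, h)) for row in W] for h in features]
    out_dim = len(W)
    hidden, weights = [], []
    for i, nbrs in enumerate(neighbors):
        # Unnormalised score for each neighbour j: LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j]))) for j in nbrs]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Hidden representation: attention-weighted sum of neighbour features.
        hidden.append([sum(al * z[j][d] for al, j in zip(alpha, nbrs)) for d in range(out_dim)])
        weights.append(alpha)
    return hidden, weights


# Toy inputs (made up for illustration): 3 nodes, 2-dimensional features.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_neighbors = [[0, 1], [1, 0, 2], [2, 1]]
identity_W = [[1.0, 0.0], [0.0, 1.0]]
zero_a = [0.0, 0.0, 0.0, 0.0]  # with a = 0 every neighbour gets equal weight
toy_hidden, toy_weights = attention_layer(toy_features, toy_neighbors, identity_W, zero_a)
```

With a zero attention vector the layer reduces to a plain neighborhood average, which makes it easy to check; a trained `a` would shift weight toward the more informative neighbors.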

The power of summarization in graph mining and learning

Proceedings of the VLDB Endowment ◽

10.14778/3484224.3484238 ◽

2021 ◽

Vol 14 (13) ◽

pp. 3416-3416

Author(s):

Danai Koutra

Keyword(s):

Graph Mining ◽

Large Scale ◽

Well Being ◽

Future Research ◽

Natural Processes ◽

Large Scale Data ◽

Challenges And Opportunities ◽

Graph Streams ◽

Computationally Intensive ◽

Graph Neural Networks

Our ability to generate, collect, and archive data related to everyday activities, such as interacting on social media, browsing the web, and monitoring well-being, is rapidly increasing. Getting the most benefit from this large-scale data requires analysis of patterns it contains, which is computationally intensive or even intractable. Summarization techniques produce compact data representations (summaries) that enable faster processing by complex algorithms and queries. This talk will cover summarization of interconnected data (graphs) [3], which can represent a variety of natural processes (e.g., friendships, communication). I will present an overview of my group's work on bridging the gap between research on summarized network representations and real-world problems. Examples include summarization of massive knowledge graphs for refinement [2] and on-device querying [4], summarization of graph streams for persistent activity detection [1], and summarization within graph neural networks for fast, interpretable classification [5]. I will conclude with open challenges and opportunities for future research.

Large Scale Graph Mining with MapReduce

Advances in Data Mining and Database Management - Graph Data Management ◽

10.4018/978-1-61350-053-8.ch013 ◽

2011 ◽

pp. 299-314

Author(s):

Charalampos E. Tsourakakis

Keyword(s):

Survey Research ◽

Present State ◽

Real World ◽

Graph Mining ◽

Large Scale ◽

State Of The Art ◽

Research Work

In this chapter, we present state-of-the-art work on large-scale graph mining using MapReduce. We survey research on an important graph mining problem: counting the number of triangles in large real-world networks. We present the most important applications of triangle counts and two families of algorithms, one spectral and one combinatorial, that solve the problem efficiently.
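
As a rough single-machine illustration of the triangle counting problem surveyed in this chapter (not the chapter's MapReduce algorithms), the sketch below counts triangles in two ways. The first is a combinatorial node-iterator count. The second uses the identity that the number of triangles equals trace(A^3) / 6 for a symmetric 0/1 adjacency matrix A; since trace(A^3) is also the sum of the cubed eigenvalues of A, this is the quantity that spectral approaches approximate. The function names and the example edge list are hypothetical.

```python
from itertools import combinations


def count_triangles_node_iterator(edges):
    """Combinatorial count: for each node, test every pair of its neighbours."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = 0
    for nbrs in adj.values():
        for x, y in combinations(sorted(nbrs), 2):
            if y in adj[x]:
                found += 1
    return found // 3  # each triangle is seen once from each of its corners


def count_triangles_trace(edges):
    """Algebraic count: trace(A^3) / 6 for the undirected adjacency matrix A."""
    nodes = sorted({x for edge in edges for x in edge})
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        if u != v:
            A[index[u]][index[v]] = 1
            A[index[v]][index[u]] = 1
    # A2 = A * A, then trace(A2 * A) without forming the full cube.
    A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    trace_a3 = sum(A2[i][k] * A[k][i] for i in range(n) for k in range(n))
    return trace_a3 // 6  # every triangle contributes 6 closed walks of length 3


# Example graph (made up): two triangles sharing node 2.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
```

The dense matrix product is only practical for small graphs; it is shown here to make the link between the combinatorial and algebraic views concrete.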

Large-scale graph mining using backbone refinement classes

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09 ◽

10.1145/1557019.1557089 ◽

2009 ◽

Cited By ~ 13

Author(s):

Andreas Maunz ◽

Christoph Helma ◽

Stefan Kramer

Keyword(s):

Graph Mining ◽

Large Scale

Large Scale Graph Mining with G-Miner

Proceedings of the 2019 International Conference on Management of Data - SIGMOD '19 ◽

10.1145/3299869.3320219 ◽

2019 ◽

Cited By ~ 1

Author(s):

Hongzhi Chen ◽

Xiaoxi Wang ◽

Chenghuan Huang ◽

Juncheng Fang ◽

Yifan Hou ◽

...

Keyword(s):

Graph Mining ◽

Large Scale

A fast and robust kernel optimization method for core–periphery detection in directed and weighted graphs

Applied Network Science ◽

10.1007/s41109-019-0173-9 ◽

2019 ◽

Vol 4 (1) ◽

Author(s):

Francesco Tudisco ◽

Desmond J. Higham

Keyword(s):

Social Sciences ◽

Graph Mining ◽

Large Scale ◽

Optimization Problem ◽

Graph Model ◽

Optimization Method ◽

Weighted Graphs ◽

Classification Problems ◽

Random Graph Model ◽

Kernel Optimization

Abstract: Many graph mining tasks can be viewed as classification problems on high-dimensional data. Within this class we consider the issue of discovering core-periphery structure, which has wide applications in the economic and social sciences. In contrast to many current approaches, we allow for weighted and directed edges and we do not assume that the overall network is connected. Our approach extends recent work on a relevant relaxed nonlinear optimization problem. In the directed, weighted setting, we derive and analyze a globally convergent iterative algorithm. We also relate the algorithm to a maximum likelihood reordering problem on an appropriate core-periphery random graph model. We illustrate the effectiveness of the new algorithm on a large-scale directed email network.

Machine Learning Based Graph Mining of Large-scale Network and Optimization

10.1145/3469213.3470320 ◽

2021 ◽

Author(s):

Mingyue Liu

Keyword(s):

Machine Learning ◽

Graph Mining ◽

Large Scale ◽

Large Scale Network ◽

Scale Network
