A Degeneracy Framework for Graph Similarity

Author(s):  
Giannis Nikolentzos ◽  
Polykarpos Meladianos ◽  
Stratis Limnios ◽  
Michalis Vazirgiannis

The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different structure at different scales. In this paper, we present a general framework for graph similarity which takes into account structure at multiple different scales. The proposed framework capitalizes on the well-known k-core decomposition of graphs in order to build a hierarchy of nested subgraphs. We apply the framework to derive variants of four graph kernels, namely graphlet kernel, shortest-path kernel, Weisfeiler-Lehman subtree kernel, and pyramid match graph kernel. The framework is not limited to graph kernels, but can be applied to any graph comparison algorithm. The proposed framework is evaluated on several benchmark datasets for graph classification. In most cases, the core-based kernels achieve significant improvements in terms of classification accuracy over the base kernels, while their time complexity remains very attractive.
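The framework's central construction can be summarised in a few lines of code. The sketch below is an illustration of the idea described in the abstract, not the authors' implementation: it uses networkx to build the nested hierarchy of k-cores and sums an arbitrary base kernel over corresponding levels. The helper names and the truncation to the shallower hierarchy (via zip) are assumptions of this sketch.

```python
import networkx as nx

def core_hierarchy(G):
    """Nested hierarchy of k-core subgraphs C_0 ⊇ C_1 ⊇ ... up to the degeneracy of G."""
    degeneracy = max(nx.core_number(G).values())   # largest k with a non-empty k-core
    return [nx.k_core(G, k) for k in range(degeneracy + 1)]

def core_based_kernel(G1, G2, base_kernel):
    """Sum a base graph kernel over corresponding levels of the two core hierarchies."""
    h1, h2 = core_hierarchy(G1), core_hierarchy(G2)
    return sum(base_kernel(a, b) for a, b in zip(h1, h2))
```

Here base_kernel can be any graph-comparison function, for example a graphlet, shortest-path or Weisfeiler-Lehman kernel taken from an existing kernel library.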


2019 ◽  
pp. 295-312
Author(s):  
Frances Cassidy ◽  
Margee Hume

Core and peripheral destinations are very significant to island tourism because island groups comprise both core and peripheral islands. Peripheral locations may be disadvantaged as they are isolated from the core or economic centers and from the main population. This chapter reviews the literature on the complexity of core and peripheral destinations, their development, planning, marketing and management, together with local residents' perceptions of tourists and tourists' expectations. The South Pacific is defined and its colonial past discussed, together with tourist motivations. It is becoming increasingly difficult for all stakeholders to agree on programs and tourism practices, and the various South Pacific countries have different ways of collecting statistical data, resulting in few generic standards to adhere to.


Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 984 ◽  
Author(s):  
Yi Zhang ◽  
Lulu Wang ◽  
Liandong Wang

Graph kernels are of vital importance in the field of graph comparison and classification. However, how to compare and evaluate graph kernels, and how to choose an optimal kernel for a practical classification problem, remain open problems. In this paper, a comprehensive evaluation framework of graph kernels is proposed for unattributed graph classification. According to their design methods, the graph kernel family can be categorized along five different dimensions, and several representative graph kernels are chosen from these categories to perform the evaluation. Using a wide range of real-world and synthetic datasets, the kernels are compared by criteria such as classification accuracy, F1 score, runtime cost, scalability and applicability. Finally, quantitative conclusions are drawn from the analyses of the extensive experimental results. The main contribution of this paper is a comprehensive evaluation framework of graph kernels, which is significant for graph-classification applications and future kernel research.
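One concrete form such an evaluation can take is a C-SVM trained on a precomputed kernel matrix and scored by cross-validated accuracy and F1. The snippet below is a minimal sketch of that protocol using scikit-learn; the 10-fold split, the macro-averaged F1 and the fixed C value are illustrative assumptions, and the kernel matrix K can come from any graph kernel implementation.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

def evaluate_kernel(K, y, n_splits=10, C=1.0):
    """Estimate accuracy and macro-F1 of a precomputed kernel matrix K with a C-SVM."""
    accs, f1s = [], []
    for train, test in StratifiedKFold(n_splits, shuffle=True, random_state=0).split(K, y):
        clf = SVC(kernel="precomputed", C=C)
        clf.fit(K[np.ix_(train, train)], y[train])     # train-vs-train kernel block
        pred = clf.predict(K[np.ix_(test, train)])     # test-vs-train kernel block
        accs.append(accuracy_score(y[test], pred))
        f1s.append(f1_score(y[test], pred, average="macro"))
    return np.mean(accs), np.mean(f1s)
```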


2019 ◽  
Vol 10 ◽  
Author(s):  
Debleena Guin ◽  
Jyoti Rani ◽  
Priyanka Singh ◽  
Sandeep Grover ◽  
Shivangi Bora ◽  
...  

Understanding patients’ genomic variations and their effect in protecting or predisposing them to drug response phenotypes is important for providing personalized healthcare. Several studies have manually curated such genotype–phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available to extract high-accuracy information from such existing knowledge. In this work, we used a semiautomated text mining approach to build a comprehensive pharmacogenomic (PGx) resource integrating disease–drug–gene–polymorphism relationships into a global perspective for ease in therapeutic approaches. We used an R package, pubmed.mineR, to automatically retrieve PGx-related literature. We identified 1,753 disease types and 666 drugs associated with 4,132 genes and 33,942 polymorphisms, collated from 180,088 publications. With further manual curation, we obtained a total of 2,304 PGx relationships. We evaluated the performance of our approach (precision = 0.806) against benchmark datasets such as the Pharmacogenomics Knowledgebase (PharmGKB) (0.904), Online Mendelian Inheritance in Man (OMIM) (0.600), and the Comparative Toxicogenomics Database (CTD) (0.729). We validated our study by comparing our results with 362 commercially used US Food and Drug Administration (FDA)-approved drug-labeling biomarkers. Of the 2,304 PGx relationships identified, 127 belonged to the FDA list of 362 approved pharmacogenomic markers, indicating that our semiautomated text mining approach may reveal significant PGx information with markers for drug response prediction. In addition, it is a scalable, state-of-the-art approach to curation for PGx clinical utility.
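The benchmark comparison reduces to set-level precision over extracted relationships. Below is a minimal sketch, assuming the relationships have already been normalised to comparable (disease, drug, gene, polymorphism) tuples; that normalisation is the hard, curation-dependent step and is not shown here.

```python
def precision_against_benchmark(extracted, benchmark):
    """Precision = fraction of extracted PGx tuples confirmed by the benchmark set."""
    extracted, benchmark = set(extracted), set(benchmark)
    return len(extracted & benchmark) / len(extracted) if extracted else 0.0
```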


Author(s):  
Niloofer Shanavas ◽  
Hui Wang ◽  
Zhiwei Lin ◽  
Glenn Hawe

Automatic text classification using machine learning is significantly affected by the text representation model. The structural information in text is necessary for natural language understanding, but it is usually ignored in vector-based representations. In this paper, we present a graph kernel-based text classification framework which utilises the structural information in text effectively through the weighting and enrichment of a graph-based representation. We introduce weighted co-occurrence graphs to represent text documents, which weight the terms and their dependencies based on their relevance to text classification. We propose a novel method to automatically enrich the weighted graphs using semantic knowledge in the form of a word similarity matrix. The similarity between enriched graphs, knowledge-driven graph similarity, is calculated using a graph kernel. The semantic knowledge in the enriched graphs ensures that the graph kernel goes beyond exact matching of terms and patterns to compute the semantic similarity of documents. In experiments on sentiment classification and topic classification tasks, our knowledge-driven similarity measure significantly outperforms the baseline text similarity measures on five benchmark text classification datasets.
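As an illustration of the first step, the sketch below builds a weighted co-occurrence graph from a tokenised document with networkx. The sliding-window size and the optional per-term relevance weights are assumptions standing in for the paper's supervised term weighting, and the semantic enrichment and graph kernel are not shown.

```python
import networkx as nx

def cooccurrence_graph(tokens, window=3, term_weight=None):
    """Weighted co-occurrence graph: nodes are terms (weighted by relevance),
    edges link terms co-occurring within a sliding window (weighted by count)."""
    term_weight = term_weight or {}
    G = nx.Graph()
    for i, term in enumerate(tokens):
        G.add_node(term, weight=term_weight.get(term, 1.0))
        for other in tokens[max(0, i - window + 1):i]:
            if other != term:
                prev = G.get_edge_data(term, other, default={"weight": 0.0})["weight"]
                G.add_edge(term, other, weight=prev + 1.0)
    return G
```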


Algorithms ◽  
2019 ◽  
Vol 12 (11) ◽  
pp. 223 ◽  
Author(s):  
Alessio Martino ◽  
Alessandro Giuliani ◽  
Antonello Rizzi

This paper investigates a novel graph embedding procedure based on simplicial complexes. Inherited from algebraic topology, simplicial complexes are collections of increasing-order simplices (e.g., points, lines, triangles, tetrahedra) which can be interpreted as potentially meaningful substructures (i.e., information granules) on top of which an embedding space can be built by means of symbolic histograms. In the embedding space, any Euclidean pattern recognition system can be used, possibly equipped with feature selection capabilities in order to select the most informative symbols. The selected symbols can be analysed by field experts in order to extract further knowledge about the process to be modelled by the learning system; hence, the proposed modelling strategy can be considered a grey-box. The proposed embedding has been tested on thirty benchmark datasets for graph classification. We further present two real-world applications, namely predicting proteins’ enzymatic function and solubility propensity starting from their 3D structure, as an example of the knowledge discovery phase which can be carried out on top of the proposed embedding strategy.
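A stripped-down illustration of the symbolic-histogram idea follows. As a deliberately simplified assumption, it uses the clique complex of the graph and takes the simplex order itself as the "symbol", whereas the paper obtains its symbols from an information-granulation step; the point is only to show how a graph becomes a fixed-length Euclidean vector that any classifier can consume.

```python
import networkx as nx
import numpy as np

def symbolic_histogram(G, max_order=5):
    """Embed a graph as a histogram over simplex orders (0 = point, 1 = edge,
    2 = triangle, ...) of its clique complex."""
    counts = np.zeros(max_order)
    for clique in nx.enumerate_all_cliques(G):   # yielded in non-decreasing size
        order = len(clique) - 1
        if order >= max_order:
            break                                # all remaining cliques are larger
        counts[order] += 1
    return counts
```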


Author(s):  
KASPAR RIESEN ◽  
HORST BUNKE

Graphs provide us with a powerful and flexible representation formalism for pattern classification. Many classification algorithms have been proposed in the literature. However, the vast majority of these algorithms rely on vectorial data descriptions and cannot directly be applied to graphs. Recently, a growing interest in graph kernel methods can be observed. Graph kernels aim at bridging the gap between the high representational power and flexibility of graphs and the large number of algorithms available for object representations in terms of feature vectors. In the present paper, we propose an approach that transforms graphs into n-dimensional real vectors by means of prototype selection and graph edit distance computation. This approach allows one to build graph kernels in a straightforward way. It is not only applicable to graphs, but also to other kinds of symbolic data in conjunction with any kind of dissimilarity measure; thus it is characterized by a high degree of flexibility. With several experimental results, we demonstrate the robustness and flexibility of our new method and show that our approach outperforms other graph classification methods on several graph datasets of diverse nature.
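The embedding itself is easy to sketch: select n prototype graphs, then map every graph to the vector of its edit distances to those prototypes. The snippet below illustrates this with networkx's exact graph edit distance; in practice an approximate edit distance is used, since the exact computation is exponential, and the prototype selection strategy is left out here as it is beyond a few lines.

```python
import networkx as nx
import numpy as np

def dissimilarity_embedding(G, prototypes):
    """Map a graph to an n-dimensional vector of edit distances to n prototypes."""
    # Exact GED is exponential in graph size; swap in an approximate GED for real data.
    return np.array([nx.graph_edit_distance(G, P) for P in prototypes])
```

Any vector-space classifier, or a standard kernel applied to these vectors, can then be used downstream.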


2021 ◽  
Author(s):  
Sarah Seus ◽  
Susanne Buehrer

This article is based on the evaluation of the German research funding programme “FONA - Forschung für Nachhaltigkeit” (Research for Sustainability). It reflects upon the methodological challenges confronting the evaluation. These challenges result from the specific objectives and design of the FONA programme (a strategic portfolio of heterogeneous interventions). FONA’s ambition is to fund activities in the emerging field of ‘sustainability research’. The core characteristics of sustainability research are: interdisciplinary and transdisciplinary research processes; an orientation towards transferring research results into society; and interdependency with a wider system and global perspective.


Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1155
Author(s):  
Alessio Martino ◽  
Antonello Rizzi

Graph kernels are one of the mainstream approaches to measuring similarity between graphs, especially for pattern recognition and machine learning tasks. In turn, graphs have gained a lot of attention due to their modeling capabilities for several real-world phenomena ranging from bioinformatics to social network analysis. Recently, however, attention has shifted towards hypergraphs, a generalization of plain graphs in which multi-way relations (rather than only pairwise relations) can be considered. In this paper, four (hyper)graph kernels are proposed and their efficiency and effectiveness are compared in a twofold fashion: first, by inferring the simplicial complexes on top of the underlying graphs and performing a comparison across 18 benchmark datasets against state-of-the-art approaches; second, by facing a real-world case study (i.e., metabolic pathways classification) where the input data are natively represented by hypergraphs. With this work, we aim at fostering the extension of graph kernels towards hypergraphs and, more generally, at bridging the gap between structural pattern recognition and the domain of hypergraphs.
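For readers unfamiliar with hypergraph kernels, the toy example below shows the general shape of such a kernel: a hypergraph is represented as a list of hyperedges (vertex sets), and a kernel is any positive semi-definite similarity between two such lists. The histogram intersection over hyperedge cardinalities used here is purely illustrative and is not one of the four kernels proposed in the paper.

```python
from collections import Counter

def hyperedge_size_kernel(H1, H2):
    """Toy hypergraph kernel: histogram intersection over hyperedge cardinalities.
    A hypergraph is given as a list of hyperedges, each a set of vertices."""
    c1 = Counter(len(e) for e in H1)
    c2 = Counter(len(e) for e in H2)
    return sum(min(c1[s], c2[s]) for s in c1.keys() | c2.keys())
```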


Author(s):  
Jun Guo ◽  
Jiahui Ye

Clustering on multi-view data has attracted increasing attention in the past decades. Most previous studies assume that each instance appears in all views, or that there is at least one view containing all instances. However, real-world data often suffer from missing instances in some views, leading to the research problem of partial multi-view clustering. To address this issue, this paper proposes a simple yet effective Anchor-based Partial Multi-view Clustering (APMC) method, which utilizes anchors to reconstruct instance-to-instance relationships for clustering. APMC is conceptually simple and easy to implement in practice; besides, it has clear intuitions and non-trivial empirical guarantees. Specifically, APMC first integrates intra- and inter-view similarities through anchors. Then, spectral clustering is performed on the fused similarities to obtain a unified clustering result. Compared with existing partial multi-view clustering methods, APMC has three notable advantages: 1) it can capture more non-linear relations among instances with the help of kernel-based similarities; 2) it has a much lower time complexity by virtue of a non-iterative scheme; 3) it can inherently handle data with negative entries and can be extended to more than two views. Finally, we extensively evaluate the proposed method on five benchmark datasets. Experimental results demonstrate the superiority of APMC over state-of-the-art approaches.
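A loose sketch of the anchor idea, using scikit-learn, is given below: each view contributes an instance-to-anchor similarity matrix (with zero rows for instances missing from that view), the products of these matrices are summed into a fused instance-to-instance similarity, and spectral clustering is run on the result. The RBF similarity, the NaN convention for missing instances and the uniform fusion are assumptions of this sketch, not the authors' exact formulation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def anchor_based_partial_clustering(views, anchors, n_clusters, gamma=1.0):
    """Fuse anchor-mediated similarities from (partial) views, then run spectral
    clustering on the fused instance-to-instance similarity.

    views[v]   : (n, d_v) feature matrix with NaN rows for missing instances
    anchors[v] : (m, d_v) anchor points for view v
    """
    n = views[0].shape[0]
    S = np.zeros((n, n))
    for X, A in zip(views, anchors):
        present = ~np.isnan(X).any(axis=1)                    # instances observed in this view
        Z = np.zeros((n, A.shape[0]))
        Z[present] = rbf_kernel(X[present], A, gamma=gamma)   # instance-to-anchor similarity
        S += Z @ Z.T                                          # anchor-mediated instance similarity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(S)
```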

