A Degeneracy Framework for Graph Similarity

Author(s):  
Giannis Nikolentzos ◽  
Polykarpos Meladianos ◽  
Stratis Limnios ◽  
Michalis Vazirgiannis

The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different structure at different scales. In this paper, we present a general framework for graph similarity which takes into account structure at multiple different scales. The proposed framework capitalizes on the well-known k-core decomposition of graphs in order to build a hierarchy of nested subgraphs. We apply the framework to derive variants of four graph kernels, namely graphlet kernel, shortest-path kernel, Weisfeiler-Lehman subtree kernel, and pyramid match graph kernel. The framework is not limited to graph kernels, but can be applied to any graph comparison algorithm. The proposed framework is evaluated on several benchmark datasets for graph classification. In most cases, the core-based kernels achieve significant improvements in terms of classification accuracy over the base kernels, while their time complexity remains very attractive.
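The framework's central construction can be summarised in a few lines of code. The sketch below is an illustration of the idea described in the abstract, not the authors' implementation: it uses networkx to build the nested hierarchy of k-cores and sums an arbitrary base kernel over corresponding levels. The helper names and the truncation to the shallower hierarchy (via zip) are assumptions of this sketch.

```python
import networkx as nx

def core_hierarchy(G):
    """Nested hierarchy of k-core subgraphs C_0 ⊇ C_1 ⊇ ... up to the degeneracy of G."""
    degeneracy = max(nx.core_number(G).values())   # largest k with a non-empty k-core
    return [nx.k_core(G, k) for k in range(degeneracy + 1)]

def core_based_kernel(G1, G2, base_kernel):
    """Sum a base graph kernel over corresponding levels of the two core hierarchies."""
    h1, h2 = core_hierarchy(G1), core_hierarchy(G2)
    return sum(base_kernel(a, b) for a, b in zip(h1, h2))
```

Here base_kernel can be any graph-comparison function, for example a graphlet, shortest-path or Weisfeiler-Lehman kernel taken from an existing kernel library.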


2019 ◽  
pp. 295-312
Author(s):  
Frances Cassidy ◽  
Margee Hume

Core and peripheral destinations are very significant to island tourism because island groups comprise both core and peripheral islands. Peripheral locations may be disadvantaged as they are isolated from the core or economic centers and from the main population. This chapter reviews the literature on the complexity of core and peripheral destinations, their development, planning, marketing and management, together with local residents' perceptions of tourists and tourists' expectations. The South Pacific is defined and its colonial past discussed, together with tourist motivations. It is becoming increasingly difficult for all stakeholders to agree on programs and tourism practices, and the various South Pacific countries have different ways of collecting statistical data, resulting in few generic standards to adhere to.


Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 984 ◽  
Author(s):  
Yi Zhang ◽  
Lulu Wang ◽  
Liandong Wang

Graph kernels are of vital importance in the field of graph comparison and classification. However, how to compare and evaluate graph kernels, and how to choose an optimal kernel for a practical classification problem, remain open problems. In this paper, a comprehensive evaluation framework of graph kernels is proposed for unattributed graph classification. According to their design methods, the graph kernel family can be categorized along five different dimensions, and several representative graph kernels are chosen from these categories to perform the evaluation. Using a wide range of real-world and synthetic datasets, the kernels are compared by criteria such as classification accuracy, F1 score, runtime cost, scalability and applicability. Finally, quantitative conclusions are drawn from the analyses of the extensive experimental results. The main contribution of this paper is a comprehensive evaluation framework of graph kernels, which is significant for graph-classification applications and future kernel research.
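One concrete form such an evaluation can take is a C-SVM trained on a precomputed kernel matrix and scored by cross-validated accuracy and F1. The snippet below is a minimal sketch of that protocol using scikit-learn; the 10-fold split, the macro-averaged F1 and the fixed C value are illustrative assumptions, and the kernel matrix K can come from any graph kernel implementation.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

def evaluate_kernel(K, y, n_splits=10, C=1.0):
    """Estimate accuracy and macro-F1 of a precomputed kernel matrix K with a C-SVM."""
    accs, f1s = [], []
    for train, test in StratifiedKFold(n_splits, shuffle=True, random_state=0).split(K, y):
        clf = SVC(kernel="precomputed", C=C)
        clf.fit(K[np.ix_(train, train)], y[train])     # train-vs-train kernel block
        pred = clf.predict(K[np.ix_(test, train)])     # test-vs-train kernel block
        accs.append(accuracy_score(y[test], pred))
        f1s.append(f1_score(y[test], pred, average="macro"))
    return np.mean(accs), np.mean(f1s)
```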


2019 ◽  
Vol 10 ◽  
Author(s):  
Debleena Guin ◽  
Jyoti Rani ◽  
Priyanka Singh ◽  
Sandeep Grover ◽  
Shivangi Bora ◽  
...  

Understanding patients’ genomic variations and their effect in protecting or predisposing them to drug response phenotypes is important for providing personalized healthcare. Several studies have manually curated such genotype–phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available to extract high-accuracy information from such existing knowledge. In this work, we used a semiautomated text mining approach to build a comprehensive pharmacogenomic (PGx) resource integrating disease–drug–gene–polymorphism relationships into a global perspective for ease in therapeutic approaches. We used an R package, pubmed.mineR, to automatically retrieve PGx-related literature. We identified 1,753 disease types and 666 drugs associated with 4,132 genes and 33,942 polymorphisms, collated from 180,088 publications. With further manual curation, we obtained a total of 2,304 PGx relationships. We evaluated the performance of our approach (precision = 0.806) against benchmark datasets such as the Pharmacogenomics Knowledgebase (PharmGKB) (0.904), Online Mendelian Inheritance in Man (OMIM) (0.600), and the Comparative Toxicogenomics Database (CTD) (0.729). We validated our study by comparing our results with 362 commercially used US Food and Drug Administration (FDA)-approved drug-labeling biomarkers. Of the 2,304 PGx relationships identified, 127 belonged to the FDA list of 362 approved pharmacogenomic markers, indicating that our semiautomated text mining approach may reveal significant PGx information with markers for drug response prediction. In addition, it is a scalable, state-of-the-art approach to curation for PGx clinical utility.
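The benchmark comparison reduces to set-level precision over extracted relationships. Below is a minimal sketch, assuming the relationships have already been normalised to comparable (disease, drug, gene, polymorphism) tuples; that normalisation is the hard, curation-dependent step and is not shown here.

```python
def precision_against_benchmark(extracted, benchmark):
    """Precision = fraction of extracted PGx tuples confirmed by the benchmark set."""
    extracted, benchmark = set(extracted), set(benchmark)
    return len(extracted & benchmark) / len(extracted) if extracted else 0.0
```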


Author(s):  
Niloofer Shanavas ◽  
Hui Wang ◽  
Zhiwei Lin ◽  
Glenn Hawe

Automatic text classification using machine learning is significantly affected by the text representation model. The structural information in text is necessary for natural language understanding, but it is usually ignored in vector-based representations. In this paper, we present a graph kernel-based text classification framework which utilises the structural information in text effectively through the weighting and enrichment of a graph-based representation. We introduce weighted co-occurrence graphs to represent text documents, which weight the terms and their dependencies based on their relevance to text classification. We propose a novel method to automatically enrich the weighted graphs using semantic knowledge in the form of a word similarity matrix. The similarity between enriched graphs, knowledge-driven graph similarity, is calculated using a graph kernel. The semantic knowledge in the enriched graphs ensures that the graph kernel goes beyond exact matching of terms and patterns to compute the semantic similarity of documents. In experiments on sentiment classification and topic classification tasks, our knowledge-driven similarity measure significantly outperforms the baseline text similarity measures on five benchmark text classification datasets.
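As an illustration of the first step, the sketch below builds a weighted co-occurrence graph from a tokenised document with networkx. The sliding-window size and the optional per-term relevance weights are assumptions standing in for the paper's supervised term weighting, and the semantic enrichment and graph kernel are not shown.

```python
import networkx as nx

def cooccurrence_graph(tokens, window=3, term_weight=None):
    """Weighted co-occurrence graph: nodes are terms (weighted by relevance),
    edges link terms co-occurring within a sliding window (weighted by count)."""
    term_weight = term_weight or {}
    G = nx.Graph()
    for i, term in enumerate(tokens):
        G.add_node(term, weight=term_weight.get(term, 1.0))
        for other in tokens[max(0, i - window + 1):i]:
            if other != term:
                prev = G.get_edge_data(term, other, default={"weight": 0.0})["weight"]
                G.add_edge(term, other, weight=prev + 1.0)
    return G
```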


Algorithms ◽  
2019 ◽  
Vol 12 (11) ◽  
pp. 223 ◽  
Author(s):  
Alessio Martino ◽  
Alessandro Giuliani ◽  
Antonello Rizzi

This paper investigates a novel graph embedding procedure based on simplicial complexes. Inherited from algebraic topology, simplicial complexes are collections of increasing-order simplices (e.g., points, lines, triangles, tetrahedra) which can be interpreted as potentially meaningful substructures (i.e., information granules) on top of which an embedding space can be built by means of symbolic histograms. In the embedding space, any Euclidean pattern recognition system can be used, possibly equipped with feature selection capabilities in order to select the most informative symbols. The selected symbols can be analysed by field experts in order to extract further knowledge about the process to be modelled by the learning system; hence, the proposed modelling strategy can be considered a grey-box. The proposed embedding has been tested on thirty benchmark datasets for graph classification. We further present two real-world applications, namely predicting proteins’ enzymatic function and solubility propensity starting from their 3D structure, as an example of the knowledge discovery phase which can be carried out on top of the proposed embedding strategy.
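A stripped-down illustration of the symbolic-histogram idea follows. As a deliberately simplified assumption, it uses the clique complex of the graph and takes the simplex order itself as the "symbol", whereas the paper obtains its symbols from an information-granulation step; the point is only to show how a graph becomes a fixed-length Euclidean vector that any classifier can consume.

```python
import networkx as nx
import numpy as np

def symbolic_histogram(G, max_order=5):
    """Embed a graph as a histogram over simplex orders (0 = point, 1 = edge,
    2 = triangle, ...) of its clique complex."""
    counts = np.zeros(max_order)
    for clique in nx.enumerate_all_cliques(G):   # yielded in non-decreasing size
        order = len(clique) - 1
        if order >= max_order:
            break                                # all remaining cliques are larger
        counts[order] += 1
    return counts
```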


Author(s):  
KASPAR RIESEN ◽  
HORST BUNKE

Graphs provide us with a powerful and flexible representation formalism for pattern classification. Many classification algorithms have been proposed in the literature. However, the vast majority of these algorithms rely on vectorial data descriptions and cannot directly be applied to graphs. Recently, a growing interest in graph kernel methods can be observed. Graph kernels aim at bridging the gap between the high representational power and flexibility of graphs and the large number of algorithms available for object representations in terms of feature vectors. In the present paper, we propose an approach that transforms graphs into n-dimensional real vectors by means of prototype selection and graph edit distance computation. This approach allows one to build graph kernels in a straightforward way. It is not only applicable to graphs, but also to other kinds of symbolic data in conjunction with any kind of dissimilarity measure; thus it is characterized by a high degree of flexibility. With several experimental results, we demonstrate the robustness and flexibility of our new method and show that our approach outperforms other graph classification methods on several graph datasets of diverse nature.
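The embedding itself is easy to sketch: select n prototype graphs, then map every graph to the vector of its edit distances to those prototypes. The snippet below illustrates this with networkx's exact graph edit distance; in practice an approximate edit distance is used, since the exact computation is exponential, and the prototype selection strategy is left out here as it is beyond a few lines.

```python
import networkx as nx
import numpy as np

def dissimilarity_embedding(G, prototypes):
    """Map a graph to an n-dimensional vector of edit distances to n prototypes."""
    # Exact GED is exponential in graph size; swap in an approximate GED for real data.
    return np.array([nx.graph_edit_distance(G, P) for P in prototypes])
```

Any vector-space classifier, or a standard kernel applied to these vectors, can then be used downstream.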


2021 ◽  
Author(s):  
Sarah Seus ◽  
Susanne Buehrer

This article is based on the evaluation of the German research funding programme “FONA - Forschung für Nachhaltigkeit” (Research for Sustainability). It reflects upon the methodological challenges confronting the evaluation. These challenges result from the specific objectives and design of the FONA programme (a strategic portfolio of heterogeneous interventions). FONA’s ambition is to fund activities in the emerging field of ‘sustainability research’. The core characteristics of sustainability research are: interdisciplinary and transdisciplinary research processes; an orientation towards transferring research results into society; and interdependency with a wider system and global perspective.


Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1155
Author(s):  
Alessio Martino ◽  
Antonello Rizzi

Graph kernels are one of the mainstream approaches to measuring similarity between graphs, especially for pattern recognition and machine learning tasks. In turn, graphs have gained a lot of attention due to their modeling capabilities for several real-world phenomena ranging from bioinformatics to social network analysis. Recently, however, attention has shifted towards hypergraphs, a generalization of plain graphs in which multi-way relations (rather than only pairwise relations) can be considered. In this paper, four (hyper)graph kernels are proposed and their efficiency and effectiveness are compared in a twofold fashion: first, by inferring the simplicial complexes on top of the underlying graphs and performing a comparison across 18 benchmark datasets against state-of-the-art approaches; second, by facing a real-world case study (i.e., metabolic pathways classification) where the input data are natively represented by hypergraphs. With this work, we aim at fostering the extension of graph kernels towards hypergraphs and, more generally, at bridging the gap between structural pattern recognition and the domain of hypergraphs.
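For readers unfamiliar with hypergraph kernels, the toy example below shows the general shape of such a kernel: a hypergraph is represented as a list of hyperedges (vertex sets), and a kernel is any positive semi-definite similarity between two such lists. The histogram intersection over hyperedge cardinalities used here is purely illustrative and is not one of the four kernels proposed in the paper.

```python
from collections import Counter

def hyperedge_size_kernel(H1, H2):
    """Toy hypergraph kernel: histogram intersection over hyperedge cardinalities.
    A hypergraph is given as a list of hyperedges, each a set of vertices."""
    c1 = Counter(len(e) for e in H1)
    c2 = Counter(len(e) for e in H2)
    return sum(min(c1[s], c2[s]) for s in c1.keys() | c2.keys())
```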


Author(s):  
Jun Guo ◽  
Jiahui Ye

Clustering on multi-view data has attracted increasing attention in the past decades. Most previous studies assume that each instance appears in all views, or that there is at least one view containing all instances. However, real-world data often suffer from missing instances in some views, leading to the research problem of partial multi-view clustering. To address this issue, this paper proposes a simple yet effective Anchor-based Partial Multi-view Clustering (APMC) method, which utilizes anchors to reconstruct instance-to-instance relationships for clustering. APMC is conceptually simple and easy to implement in practice; besides, it has clear intuitions and non-trivial empirical guarantees. Specifically, APMC first integrates intra- and inter-view similarities through anchors. Then, spectral clustering is performed on the fused similarities to obtain a unified clustering result. Compared with existing partial multi-view clustering methods, APMC has three notable advantages: 1) it can capture more non-linear relations among instances with the help of kernel-based similarities; 2) it has a much lower time complexity by virtue of a non-iterative scheme; 3) it can inherently handle data with negative entries and can be extended to more than two views. Finally, we extensively evaluate the proposed method on five benchmark datasets. Experimental results demonstrate the superiority of APMC over state-of-the-art approaches.
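A loose sketch of the anchor idea, using scikit-learn, is given below: each view contributes an instance-to-anchor similarity matrix (with zero rows for instances missing from that view), the products of these matrices are summed into a fused instance-to-instance similarity, and spectral clustering is run on the result. The RBF similarity, the NaN convention for missing instances and the uniform fusion are assumptions of this sketch, not the authors' exact formulation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def anchor_based_partial_clustering(views, anchors, n_clusters, gamma=1.0):
    """Fuse anchor-mediated similarities from (partial) views, then run spectral
    clustering on the fused instance-to-instance similarity.

    views[v]   : (n, d_v) feature matrix with NaN rows for missing instances
    anchors[v] : (m, d_v) anchor points for view v
    """
    n = views[0].shape[0]
    S = np.zeros((n, n))
    for X, A in zip(views, anchors):
        present = ~np.isnan(X).any(axis=1)                    # instances observed in this view
        Z = np.zeros((n, A.shape[0]))
        Z[present] = rbf_kernel(X[present], A, gamma=gamma)   # instance-to-anchor similarity
        S += Z @ Z.T                                          # anchor-mediated instance similarity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(S)
```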

