A Comprehensive Evaluation of Graph Kernels for Unattributed Graphs

Graph kernels are of vital importance in the field of graph comparison and classification. However, how to compare and evaluate graph kernels, and how to choose an optimal kernel for a practical classification problem, remain open problems. In this paper, a comprehensive evaluation framework for graph kernels is proposed for unattributed graph classification. According to their design methods, graph kernels are categorized along five different dimensions, and several representative kernels are chosen from these categories for evaluation. Using a large number of real-world and synthetic datasets, the kernels are compared on criteria such as classification accuracy, F1 score, runtime cost, scalability and applicability. Finally, quantitative conclusions are drawn from the analysis of the extensive experimental results. The main contribution of this paper is the comprehensive evaluation framework itself, which is significant for graph-classification applications and future kernel research.
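
As a minimal illustration of what a kernel on unattributed graphs computes, the sketch below compares two graphs by the inner product of their degree histograms. This simple kernel is an assumption chosen for brevity and is not claimed to be one of the kernels evaluated in the paper; the function names and the toy graphs are hypothetical.

```python
from collections import Counter


def degree_histogram(edges, num_nodes):
    """Count how many nodes have each degree in an undirected graph."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree)


def degree_kernel(graph_a, graph_b):
    """Inner product of the two degree histograms (a linear kernel)."""
    hist_a = degree_histogram(*graph_a)
    hist_b = degree_histogram(*graph_b)
    return sum(count * hist_b[d] for d, count in hist_a.items())


# Hypothetical toy graphs, each given as (edge list, number of nodes).
triangle = ([(0, 1), (1, 2), (0, 2)], 3)
path = ([(0, 1), (1, 2)], 3)

print(degree_kernel(triangle, path))
```

Because the kernel is an inner product of explicit feature vectors, it is symmetric and positive semidefinite, so its values could be passed to a kernel classifier such as an SVM.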

A Combined Dimensional Kernel Method for Graph Classification

Journal of Information Technology Research ◽

10.4018/jitr.2017070102 ◽

2017 ◽

Vol 10 (3) ◽

pp. 22-33

Author(s):

Tiejun Cao

Keyword(s):

Structural Information ◽

Kernel Method ◽

Three Dimensional ◽

Graph Classification ◽

Effective Technique ◽

Quadratic Constraints ◽

Molecular Chemistry ◽

One Dimensional ◽

Different Dimensions ◽

Optimal Kernel

Handling data that contains structural information is an important problem in the field of machine learning, and kernel methods are an effective technique for solving such problems. A combined dimension kernel method is proposed for graph classification in this paper. A two-dimensional kernel is first constructed, incorporating one-dimensional information to characterize the molecular chemistry, and then a three-dimensional kernel is constructed based on the knowledge of molecular mechanics to characterize the physical properties of the molecule. On this basis, the kernels of different dimensions are integrated, and a quadratic programming problem with quadratic constraints is solved to obtain the optimal kernel combination. The experimental results show that the proposed method outperforms existing algorithms.

Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media

Journal Of Big Data ◽

10.1186/s40537-021-00488-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Yahya Albalawi ◽

Jim Buckley ◽

Nikola S. Nikolov

Keyword(s):

Social Media ◽

Deep Learning ◽

Comprehensive Evaluation ◽

Classification Problem ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lower Accuracy ◽

Health Related ◽

The Impact

This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier with F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model with F1 score of 75.2% and accuracy of 90.7% compared to F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with lower accuracy of 70.89%. Our results also show that the performance of the best traditional classifier we trained is comparable to the deep learning methods on the first dataset, but significantly worse on the second dataset.

Evaluating Usability of Academic Websites through a Fuzzy Analytical Hierarchical Process

Sustainability ◽

10.3390/su13042040 ◽

2021 ◽

Vol 13 (4) ◽

pp. 2040

Author(s):

AbdulHafeez Muhammad ◽

Ansar Siddique ◽

Quadri Noorulhasan Naveed ◽

Uzma Khaliq ◽

Ali M. Aseere ◽

...

Keyword(s):

Evaluation Framework ◽

Analytic Hierarchy ◽

Academic Tasks ◽

Frequency Of Use ◽

Wide Range ◽

Different Dimensions ◽

Academic Websites ◽

Academic Information ◽

Hierarchy Process ◽

Usability Criteria

In the higher education sector, there is a growing trend to offer academic information to users through websites. Contemporarily, the users (i.e., students/teachers, parents, and administrative staff) greatly rely on these websites to perform various academic tasks, including admission, access to learning management systems (LMS), and links to other relevant resources. These users vary from each other in terms of their technological competence, objectives, and frequency of use. Therefore, academic websites should be designed considering different dimensions, so that everybody can be accommodated. Knowing the different dimensions with respect to the usability of academic websites is a multi-criteria decision-making (MCDM) problem. The fuzzy analytic hierarchy process (FAHP) approach has been considered to be a significant method to deal with the uncertainty that is involved in subjective judgment. Although a wide range of usability factors for academic websites have already been identified, most of them are based on the judgment of experts who have never used these websites. This study identified important factors through a detailed literature review, classified them, and prioritized the most critical among them through the FAHP methodology, involving relevant users to propose a usability evaluation framework for academic websites. To validate the proposed framework, five websites of renowned higher educational institutes (HEIs) were evaluated and ranked according to the usability criteria. As the proposed framework was created methodically, the authors believe that it would be helpful for detecting real usability issues that currently exist in academic websites.

Addressing Impact Evaluation Gaps in Belt and Road Initiative Projects in Africa: The Standard Gauge Railway Project in Kenya as a Proof of Concept

The African Review ◽

10.1163/1821889x-12340026 ◽

2020 ◽

pp. 1-38

Author(s):

Keren Zhu ◽

Rafiq Dossani ◽

Jennifer Bouey

Keyword(s):

Impact Evaluation ◽

Comprehensive Evaluation ◽

Stakeholder Participation ◽

Social Benefit ◽

Evaluation Framework ◽

Systematic Evaluation ◽

Belt And Road Initiative ◽

Proof Of Concept ◽

Belt And Road ◽

The Impact

The impact of the Belt and Road Initiative (BRI) on global development will be unprecedented and significant, and developmental impact evaluation is therefore central to understanding BRI projects and making informed decisions. Compared with evaluations of individual projects and programs, evaluation of large and mega infrastructure projects under the BRI is particularly challenging and complex in integrating stakeholder objectives, accounting for social benefit and costs, and tracking long-term project impact. In this paper, we summarize the key drawbacks of existing BRI evaluation frameworks, propose a systematic evaluation framework elicitation method based on the inputs from BRI subject matter experts and verified through stakeholder participation, and apply an interim evaluation framework in understanding the Mombasa-Nairobi Standard Gauge Railway project in Kenya, as a proof of concept of a comprehensive evaluation framework. In doing so, we seek to provide a tool for BRI decision makers and stakeholders to assess these projects holistically at planning, construction and operation stages.

Multiple weak supervision for short text classification

Applied Intelligence ◽

10.1007/s10489-021-02958-3 ◽

2022 ◽

Author(s):

Li-Ming Chen ◽

Bao-Xin Xiu ◽

Zhao-Yun Ding

Keyword(s):

Text Classification ◽

Classification Problem ◽

Experimental Results ◽

Prior Work ◽

Weak Supervision ◽

Short Text ◽

Imbalanced Classification ◽

Distant Supervision ◽

Synthetic Datasets ◽

Independent Model

For short text classification, insufficient labeled data, data sparsity, and imbalanced classification have become three major challenges. To address them, we propose multiple weak supervision, which can label unlabeled data automatically. Different from prior work, the proposed method generates probabilistic labels through a conditional independent model. Experiments were conducted to verify the effectiveness of multiple weak supervision. According to experimental results on public datasets, real datasets and synthetic datasets, the unlabeled imbalanced short text classification problem can be solved effectively by multiple weak supervision. Notably, recall and F1-score can be improved without reducing precision by adding distant supervision clustering, which can be used to meet different application needs.

Using API Call Sequences for IoT Malware Classification Based on Convolutional Neural Networks

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s021819402140009x ◽

2021 ◽

Vol 31 (04) ◽

pp. 587-612

Author(s):

Qianguang Lin ◽

Ni Li ◽

Qi Qi ◽

Jiabin Hu

Keyword(s):

Time Series ◽

Time Series Data ◽

Comprehensive Evaluation ◽

Optimal Solution ◽

Classification Problem ◽

Experimental Results ◽

Series Data ◽

Chi Square ◽

Malware Classification ◽

Model Training

Internet of Things (IoT) devices built on different processor architectures have increasingly become targets of adversarial attacks. In this paper, we propose an algorithm for the malware classification problem of the IoT domain to deal with the increasingly severe IoT security threats. Application executions are represented by sequences of consecutive API calls. The time series data is analyzed and filtered based on an improved information gain. According to the experimental results, this filter is more effective than chi-square statistics at reducing the sequence lengths of the input data while keeping the important information. We use a multi-layer convolutional neural network, which is suitable for processing time series data, to classify various types of malware. When the convolution window slides down the time sequence, it can obtain higher-level positions by collecting different sequence features, thereby understanding the characteristics of the corresponding sequence position. By comparing the iterative efficiency of different optimization algorithms in the model, we select an algorithm that can approximate the optimal solution in a small number of iterations to speed up the convergence of the model training. The experimental results on real-world IoT malware samples show that the classification accuracy of this approach can reach more than 98%. Overall, a comprehensive evaluation shows that our method is practically suitable for IoT malware classification, with high accuracy and low computational overhead.
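
The filtering step described above can be illustrated with the standard information gain of a single API call with respect to the class label. This is a sketch only: the abstract does not specify the paper's improved information gain, so the textbook definition is used here as a stand-in, and the API call names and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(sequences, labels, api_call):
    """Entropy reduction from knowing whether api_call occurs in a sequence."""
    present = [y for seq, y in zip(sequences, labels) if api_call in seq]
    absent = [y for seq, y in zip(sequences, labels) if api_call not in seq]
    total = len(labels)
    conditional = sum(len(part) / total * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional


# Hypothetical API call sequences and labels, for illustration only.
sequences = [["open", "connect"], ["open", "read"],
             ["connect", "exec"], ["read", "write"]]
labels = ["malware", "benign", "malware", "benign"]

for call in ("connect", "open"):
    print(call, information_gain(sequences, labels, call))
```

Calls with a gain near zero carry little class information, so dropping them shortens the input sequences before they reach the classifier.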

Evaluating the Impact and ROI of Medical Education Programs

Cases on Instructional Design and Performance Outcomes in Medical Education - Advances in Medical Education, Research, and Ethics ◽

10.4018/978-1-7998-5092-2.ch013 ◽

2020 ◽

pp. 261-293

Author(s):

Timothy R. Brock

Keyword(s):

Medical Education ◽

Evaluation System ◽

Process Model ◽

Comprehensive Evaluation ◽

Evaluation Framework ◽

Evaluation Methodology ◽

Education Programs ◽

Organizational Excellence ◽

Medical Education Program ◽

Measurement And Evaluation

Medical education programs must deliver valued results that stakeholders expect in return for their funding investments. In the past, healthcare organizations accepted reports about test results and participant perceptions of the program as adequate evidence of course outcomes. Today, program funders expect evaluations that provide evidence that medical education programs improve organizational excellence measures to justify ongoing funding. This chapter will explain four of the five elements required of a proven, comprehensive evaluation system. This five-element system is necessary to provide the desired organizational excellence evidence that medical educators can adopt to address the needs of stakeholders at different levels of an organization. Specifically, this chapter will overview an evaluation framework, a process model, and guiding principles that are crucial elements of this methodology. The chapter ends with a case study that shows how a medical education team used this measurement and evaluation methodology to plan how they would design and evaluate a medical education program requested by executives to solve an ICU central line infection problem.

Graph Classification Based on Vector Space Embedding

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800140900748x ◽

2009 ◽

Vol 23 (06) ◽

pp. 1053-1081 ◽

Cited By ~ 48

Author(s):

Kaspar Riesen ◽

Horst Bunke

Keyword(s):

Data Sets ◽

Object Representations ◽

Graph Classification ◽

Graph Edit Distance ◽

Graph Kernels ◽

Symbolic Data ◽

Representational Power ◽

High Degree ◽

Vectorial Data ◽

Representation Formalism

Graphs provide us with a powerful and flexible representation formalism for pattern classification. Many classification algorithms have been proposed in the literature. However, the vast majority of these algorithms rely on vectorial data descriptions and cannot directly be applied to graphs. Recently, a growing interest in graph kernel methods can be observed. Graph kernels aim at bridging the gap between the high representational power and flexibility of graphs and the large number of algorithms available for object representations in terms of feature vectors. In the present paper, we propose an approach transforming graphs into n-dimensional real vectors by means of prototype selection and graph edit distance computation. This approach allows one to build graph kernels in a straightforward way. It is applicable not only to graphs, but also to other kinds of symbolic data in conjunction with any kind of dissimilarity measure. Thus it is characterized by a high degree of flexibility. With several experimental results, we demonstrate the robustness and flexibility of our new method and show that our approach outperforms other graph classification methods on several graph data sets of diverse nature.
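
The embedding idea can be sketched as follows: each graph is mapped to the vector of its dissimilarities to n chosen prototypes. The sketch is illustrative only. It assumes the prototypes are already selected, and it replaces graph edit distance with a much cruder stand-in (the size of the symmetric difference of the edge sets), relying on the abstract's remark that any dissimilarity measure can be plugged in. All function names and toy graphs are hypothetical.

```python
def as_edge_set(edges):
    """Represent an undirected graph as a frozenset of unordered node pairs."""
    return frozenset(frozenset(e) for e in edges)


def edge_set_distance(g1, g2):
    """Stand-in dissimilarity: number of edges in exactly one of the graphs.

    This is NOT graph edit distance; it only keeps the example short.
    """
    return len(g1 ^ g2)


def embed(graph, prototypes, dist=edge_set_distance):
    """Map a graph to the n-dimensional vector of distances to the prototypes."""
    return [float(dist(graph, p)) for p in prototypes]


# Hypothetical toy graphs on shared node labels.
triangle = as_edge_set([(0, 1), (1, 2), (0, 2)])
path = as_edge_set([(0, 1), (1, 2)])
star = as_edge_set([(0, 1), (0, 2), (0, 3)])

prototypes = [triangle, star]  # assumed to be pre-selected
print(embed(path, prototypes))
```

Once graphs are vectors, any standard kernel on real vectors (for example, a linear or RBF kernel) applied to the embedded graphs gives a kernel on the graphs themselves.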
