Incorporating heterogeneous information in deep learning with informative meta-paths for community recommendations

Communities of interest promote knowledge sharing and discovery on social network platforms. However, as the number of communities grows, platform users have difficulty finding suitable ones. Although recommendation methods have been proposed to help users find communities of interest, these methods ignore or exclude heterogeneous interactions between users and communities. In addition, widely used meta-paths help capture the complex semantic relations among entities but rely heavily on domain knowledge. In this study, we propose a novel recommendation model based on informative meta-path discovery in heterogeneous information networks and deep learning. Users, communities, relevant items and their relations are treated as entities in a heterogeneous information network, from which informative meta-paths are extracted on the basis of information theory to measure user-community similarities. Finally, the similarities are incorporated into a deep learning model to predict whether target users will join candidate communities. The proposed recommendation model is evaluated and compared against baseline methods on two data sets. Results demonstrate the superior performance of the proposed model in terms of precision, recall and F score.
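The abstract does not say how user-community similarity is computed along a meta-path, so the following is only a minimal sketch under an assumed measure: the share of meta-path instances that end at each community. The toy network, the `user_item` and `item_community` maps, and the function names are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def count_paths(start, hops):
    """Count meta-path instances from `start`, following one adjacency map per hop."""
    counts = {start: 1}
    for adjacency in hops:
        following = defaultdict(int)
        for node, n_paths in counts.items():
            for neighbour in adjacency.get(node, ()):
                following[neighbour] += n_paths
        counts = dict(following)
    return counts


def path_share_similarity(user, hops):
    """Assumed measure: fraction of the user's path instances reaching each end node."""
    counts = count_paths(user, hops)
    total = sum(counts.values())
    return {node: n / total for node, n in counts.items()} if total else {}


# Hypothetical toy network for the meta-path User -> Item -> Community.
user_item = {"u1": ["i1", "i2"], "u2": ["i2"]}
item_community = {"i1": ["c1"], "i2": ["c1", "c2"]}

scores_u1 = path_share_similarity("u1", [user_item, item_community])
scores_u2 = path_share_similarity("u2", [user_item, item_community])
print(scores_u1, scores_u2)
```

In a pipeline like the one described, scores of this kind would be computed per selected meta-path and passed to the prediction model as features.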

Reinforcement Learning Based Meta-Path Discovery in Large-Scale Heterogeneous Information Networks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6073 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6094-6101

Author(s):

Guojia Wan ◽

Bo Du ◽

Shirui Pan ◽

Gholamreza Haffari

Keyword(s):

Reinforcement Learning ◽

Domain Knowledge ◽

Large Scale ◽

Information Networks ◽

Superior Performance ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Lowest Common Ancestor ◽

Path Discovery ◽

Meta Path

Meta-paths are important tools for a wide variety of data mining and network analysis tasks in Heterogeneous Information Networks (HINs), owing to their flexibility and interpretability in capturing the complex semantic relations among objects. To date, most HIN analysis still relies on hand-crafted meta-paths, which require rich domain knowledge that is extremely difficult to obtain in complex, large-scale, and schema-rich HINs. In this work, we present a novel framework, Meta-path Discovery with Reinforcement Learning (MPDRL), to identify informative meta-paths from complex and large-scale HINs. To capture different semantic information between objects, we propose a novel multi-hop reasoning strategy in a reinforcement learning framework, which aims to infer the next promising relation that links a source entity to a target entity. To improve efficiency, we further develop a type context representation embedding approach that scales the RL framework to million-scale HINs. As multi-hop reasoning generates rich meta-paths of various lengths, we also perform a meta-path induction step to summarize the important meta-paths using the Lowest Common Ancestor principle. Experimental results on two large-scale HINs, Yago and NELL, validate our approach and demonstrate that our algorithm not only achieves superior performance in the link prediction task, but also identifies useful meta-paths that would have been ignored by human experts.
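The abstract names the Lowest Common Ancestor principle but does not spell out the induction procedure, so the sketch below only illustrates the general idea: type sequences of the same length are generalised position by position to their lowest common ancestor in a type hierarchy. The `parent` hierarchy, the example paths, and all function names are hypothetical.

```python
def ancestors(node, parent):
    """Return `node` followed by its ancestors, ending at the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def lowest_common_ancestor(a, b, parent):
    """Return the deepest type that is an ancestor of (or equal to) both a and b."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    return None


def induce_meta_path(type_paths, parent):
    """Generalise same-length type paths position by position."""
    first, *rest = type_paths
    merged = list(first)
    for path in rest:
        if len(path) != len(merged):
            raise ValueError("all paths must have the same length")
        merged = [lowest_common_ancestor(x, y, parent) for x, y in zip(merged, path)]
    return merged


# Hypothetical type hierarchy (child -> parent) and two observed type paths.
parent = {"Actor": "Person", "Director": "Person", "Person": "Entity", "Film": "Entity"}
paths = [["Actor", "Film", "Director"], ["Director", "Film", "Director"]]

meta_path = induce_meta_path(paths, parent)
print(meta_path)
```

Here the first position is generalised from `Actor` and `Director` to `Person`, while the positions on which both paths already agree are left unchanged.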

Tensorflow Based Deep Learning Model and Snakemake Workflow for Peptide-Protein Binding Predictions

10.1101/410928 ◽

2018 ◽

Author(s):

Gokmen Altay

Keyword(s):

Deep Learning ◽

Learning Community ◽

Domain Knowledge ◽

Peptide Binding ◽

Data Sets ◽

Binding Problem ◽

Histocompatibility Complex ◽

Community Effort ◽

Deep Learning Model

In this study, we first present a TensorFlow-based deep learning (DL) model that achieves high performance in predicting the binding of peptides to major histocompatibility complex (MHC) class I protein. Second, we provide the Python code needed to run the model and to easily load large training and test peptide-binding benchmark data sets. Third, we provide a Snakemake-based workflow that runs the model and the performance analysis for all test alleles at once, in parallel, on computers and clusters. We also provide a comparative analysis of the performance of various models. Finally, to help the community arrive at the best possible DL model, this work is intended as a ready-to-modify base model and workflow for deep learning practitioners with no domain knowledge of the MHC-peptide binding problem; it therefore provides all the necessary reference code templates and benchmark data sets for further development of the presented model architecture. All the reproducible Python code, the Snakemake workflow, the benchmark data sets and a tutorial are available online at https://github.com/altayg/Deep-Learning-MHCI.

HeteClass: A Meta-path based framework for transductive classification of objects in heterogeneous information networks

Expert Systems with Applications ◽

10.1016/j.eswa.2016.10.013 ◽

2017 ◽

Vol 68 ◽

pp. 106-122 ◽

Cited By ~ 17

Author(s):

Mukul Gupta ◽

Pradeep Kumar ◽

Bharat Bhasker

Keyword(s):

Information Networks ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path

CHIN: Classification with META-PATH in Heterogeneous Information Networks

Communications in Computer and Information Science - Applied Informatics ◽

10.1007/978-3-030-01535-0_5 ◽

2018 ◽

pp. 63-74 ◽

Cited By ~ 1

Author(s):

Jinli Zhang ◽

Zongli Jiang ◽

Tong Li

Keyword(s):

Information Networks ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path

Classification of Clinically Significant Prostate Cancer on Multi-Parametric MRI: A Validation Study Comparing Deep Learning and Radiomics

Cancers ◽

10.3390/cancers14010012 ◽

2021 ◽

Vol 14 (1) ◽

pp. 12

Author(s):

Jose M. Castillo T. ◽

Muhammad Arif ◽

Martijn P. A. Starmans ◽

Wiro J. Niessen ◽

Chris H. Bangma ◽

...

Keyword(s):

Prostate Cancer ◽

Deep Learning ◽

Characteristic Curve ◽

Model Development ◽

Learning Model ◽

Multiparametric Mri ◽

Data Sets ◽

Data Set ◽

Test Sets ◽

Deep Learning Model

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To be able to assess the generalizability of the performance of these methods, using various external data sets is crucial. While both deep-learning and radiomics approaches have been compared based on the same data set of one center, the comparison of the performances of both approaches on various data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with the performance of a radiomics model for the significant-PCa diagnosis of the cohorts of various patients. We included the data from two consecutive patient cohorts from our own center (n = 371 patients), and two external sets of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). Using multiparametric MRI (mpMRI), the radiologist tumor delineations and pathology reports were collected for all patients. During training, one of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performances of the models were assessed in terms of their area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. 
Our radiomics model that was based on delineated regions resulted in a more accurate tool for significant-PCa classification in the three unseen test sets when compared to a fully automated deep-learning model.
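The study compares models by their area under the ROC curve. As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, with ties counting one half. The labels and scores are made-up illustration values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = significant PCa, 0 = not significant.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1]

auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking, which is why a value such as 0.44 on an external test set indicates that the model does not transfer to that cohort.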

Meta path-based collective classification in heterogeneous information networks

Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12 ◽

10.1145/2396761.2398474 ◽

2012 ◽

Cited By ~ 54

Author(s):

Xiangnan Kong ◽

Philip S. Yu ◽

Ying Ding ◽

David J. Wild

Keyword(s):

Information Networks ◽

Collective Classification ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path

A k-NN-Based Approach Using MapReduce for Meta-path Classification in Heterogeneous Information Networks

Soft Computing in Data Analytics - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-13-0514-6_28 ◽

2018 ◽

pp. 277-284 ◽

Cited By ~ 2

Author(s):

Sadhana Kodali ◽

Madhavi Dabbiru ◽

B. Thirumala Rao ◽

U. Kartheek Chandra Patnaik

Keyword(s):

Information Networks ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path

Clustering via Meta-path Embedding for Heterogeneous Information Networks

2020 IEEE International Conference on Knowledge Graph (ICKG) ◽

10.1109/icbk50248.2020.00036 ◽

2020 ◽

Author(s):

Yongjun Zhang ◽

Xiaoping Yang ◽

Liang Wang

Keyword(s):

Information Networks ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path

Meta-Path Based Inductive Classification in Heterogeneous Information Networks

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2016.5623 ◽

2016 ◽

Vol 13 (10) ◽

pp. 6747-6753

Author(s):

Pingjian Ding ◽

Xiangtao Chen ◽

Zipin Guan

Keyword(s):

Heterogeneous Networks ◽

State Of The Art ◽

Test Sample ◽

Information Networks ◽

Classification Problems ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Meta Path ◽

Transductive Inference

The goal of inductive classification approaches is to infer the correct mapping from test set to labels, while the goal of transductive inference is to predict the correct labels for the given unlabeled data. Hence, newly added unlabeled samples cannot be classified by transductive classification. In this paper, we focus on studying inductive classification problems in heterogeneous networks, which involve multiple types of objects interconnected by multiple types of links. Moreover, the objects and the links gradually increase over time. To accommodate the characteristics of heterogeneous networks, a meta-path-based heterogeneous inductive classification (Hic) method was proposed. First, different sub-networks were constructed according to the selected meta-paths. Second, the characteristic paths of each sub-network were extracted using the specified minimum support and were assigned appropriate weights. Then, the Hic model based on the characteristic paths was built. Finally, the Hic scores of each classification label for each test sample were calculated via the links between test samples and sub-networks. Experiments on DBLP showed that the proposed method significantly improves accuracy and stability over the existing state-of-the-art methods for classification in dynamic heterogeneous networks.

ECOVNet: a highly effective ensemble based deep learning model for detecting COVID-19

PeerJ Computer Science ◽

10.7717/peerj-cs.551 ◽

2021 ◽

Vol 7 ◽

pp. e551

Author(s):

Nihad Karim Chowdhury ◽

Muhammad Ashad Kabir ◽

Md. Muhtadir Rahman ◽

Noortaz Rezoana

Keyword(s):

Deep Learning ◽

Detection System ◽

Learning Model ◽

Classification Performance ◽

Data Sets ◽

X Rays ◽

Proposed Model ◽

Highly Effective ◽

Chest X Ray ◽

Deep Learning Model

The goal of this research is to develop and implement a highly effective deep learning model for detecting COVID-19. To achieve this goal, we propose an ensemble of Convolutional Neural Networks (CNNs) based on EfficientNet, named ECOVNet, to detect COVID-19 from chest X-rays. To make the proposed model more robust, we have used one of the largest open-access chest X-ray data sets, COVIDx, which contains three classes: COVID-19, normal, and pneumonia. For feature extraction, we have applied an effective CNN structure, namely EfficientNet, with ImageNet pre-trained weights. The generated features are transferred into custom fine-tuned top layers followed by a set of model snapshots. The predictions of the model snapshots (which are created during a single training run) are consolidated through two ensemble strategies, i.e., hard ensemble and soft ensemble, to enhance classification performance. In addition, a visualization technique is incorporated to highlight areas that distinguish classes, thereby enhancing the understanding of primal components related to COVID-19. The results of our empirical evaluations show that the proposed ECOVNet model outperforms the state-of-the-art approaches and significantly improves detection performance, with 100% recall for COVID-19 and an overall accuracy of 96.07%. We believe that ECOVNet can enhance the detection of COVID-19 disease and thus underpin a fully automated and efficacious COVID-19 detection system.
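The two ensemble strategies named in the abstract can be illustrated with a small sketch. It assumes the usual definitions, which the abstract itself does not spell out: a hard ensemble takes a majority vote over each snapshot's predicted class, and a soft ensemble averages the snapshots' class probabilities before picking the largest. The probability values and function names are hypothetical.

```python
from collections import Counter

CLASSES = ("COVID-19", "normal", "pneumonia")


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def hard_ensemble(snapshot_probs):
    """Majority vote over each snapshot's most probable class."""
    votes = [argmax(probs) for probs in snapshot_probs]
    return Counter(votes).most_common(1)[0][0]


def soft_ensemble(snapshot_probs):
    """Average class probabilities across snapshots, then take the argmax."""
    n = len(snapshot_probs)
    averaged = [sum(probs[i] for probs in snapshot_probs) / n
                for i in range(len(snapshot_probs[0]))]
    return argmax(averaged)


# Made-up probabilities from three snapshots for one X-ray image.
snapshots = [
    [0.9, 0.1, 0.0],
    [0.4, 0.6, 0.0],
    [0.4, 0.6, 0.0],
]

hard_label = CLASSES[hard_ensemble(snapshots)]
soft_label = CLASSES[soft_ensemble(snapshots)]
print(hard_label, soft_label)
```

The example is chosen so that the two strategies disagree: two of three snapshots vote for "normal", but the one confident snapshot pulls the averaged probabilities toward "COVID-19".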
