A Survey of Data Mining Techniques on Information Networks

Sadhana Kodali; Madhavi Dabbiru; B Thirumala Rao

doi:10.14419/ijet.v7i2.6.11267

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.6.11267 ◽

2018 ◽

Vol 7 (2.6) ◽

pp. 293

Author(s):

Sadhana Kodali ◽

Madhavi Dabbiru ◽

B Thirumala Rao

Keyword(s):

Data Mining ◽

Social Media ◽

Information Networks ◽

Information Network ◽

Heterogeneous Information ◽

Data Mining Techniques ◽

Heterogeneous Information Networks ◽

Web Objects ◽

The Social ◽

Social Media Network

An information network is a network formed by interconnected objects and the interactions between them. Such networks appear in day-to-day life, for example social media networks and networks formed by the interaction of web objects. This paper presents a survey of Data Mining techniques applicable to information networks. Data Mining techniques for both homogeneous and heterogeneous information networks are discussed in detail, and a comparative study of each problem category is presented.

Mining on Social Media Data: To Determine the Personality of Unrevealed Person

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4544.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 8574-8577

Keyword(s):

Data Mining ◽

Social Media ◽

Information Mining ◽

Social Media Data ◽

Data Mining Techniques ◽

Network Discovery ◽

The Social ◽

Mining Methods ◽

Using Data ◽

Media Data

The pervasive use of social media such as Facebook is producing unprecedented amounts of social data. Data mining techniques have been widely used to extract knowledge from such data; prominent examples are network discovery and sentiment analysis. In this work, data mining techniques are applied to user-generated data to predict a person's character, that is, whether the person is good or not. However, there is still much room to explore event data, i.e., events with timestamps such as posting a question, editing an article in Wikipedia, or commenting on a tweet. These events reflect users' behavior patterns and working processes on social media websites.

A comparative study on heterogeneous information network embeddings

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-191796 ◽

2020 ◽

Vol 39 (3) ◽

pp. 3463-3473

Author(s):

Fujiao Ji ◽

Zhongying Zhao ◽

Hui Zhou ◽

Heng Chi ◽

Chao Li

Keyword(s):

Comparative Study ◽

Communication Networks ◽

Problem Definition ◽

Information Networks ◽

Future Research ◽

Information Network ◽

Network Embedding ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Heterogeneous Information Networks

Heterogeneous information networks are widely used to represent real-world applications in the form of social networks, word co-occurrence networks, communication networks, etc. However, it is difficult for traditional machine learning methods to analyze these networks effectively. Heterogeneous information network embedding aims to convert the network into low-dimensional vectors, which facilitates downstream tasks. It is therefore receiving tremendous attention from the research community due to its effectiveness and efficiency. Although numerous methods have been proposed and applied successfully, few works make a comparative study of heterogeneous information network embedding, which is very important for developers and researchers who need to select an appropriate method. To address this problem, we make a comparative study of heterogeneous information network embeddings. Specifically, we first give the problem definition of heterogeneous information network embedding. The heterogeneous information networks are then classified into four categories from the perspective of network type. The state-of-the-art methods for each category are also compared and reviewed. Finally, we draw conclusions and suggest some potential future research directions.

Distributed Algorithms for Finding Meta-Paths of a Large Heterogeneous Information Network on Cloud

Modern Principles, Practices, and Algorithms for Cloud Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-1082-7.ch011 ◽

2020 ◽

pp. 223-249

Author(s):

Phuc Do

Keyword(s):

Decision Making ◽

Distributed Algorithms ◽

Information Networks ◽

Information Network ◽

Breadth First Search ◽

Cloud Computing Environment ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Path Discovery ◽

Meta Path

A meta-path is an important concept in heterogeneous information networks (HINs). Meta-paths have been used in many tasks, such as information retrieval, decision making, and product recommendation. Normally, meta-paths are proposed by human experts. Recent works on meta-path discovery have proposed in-memory solutions that fit on one computer. For large HINs, however, the whole HIN cannot be loaded into memory. In this chapter, the authors propose distributed algorithms to discover meta-paths of large HINs on the cloud. They develop distributed algorithms to discover the significant meta-path, the maximal significant meta-path, and the top-k meta-paths between two vertices of an HIN. Calculating the support of meta-paths or performing breadth-first search can be computationally costly in very large HINs. The distributed algorithms use the GraphFrames library of Apache Spark in a cloud computing environment to query large HINs efficiently. The authors conduct experiments on a large DBLP dataset to demonstrate the performance of their algorithms on the cloud.
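To make the idea of discovering meta-paths between two vertices concrete, the following is a minimal single-machine sketch in Python. It is not the chapter's distributed algorithm and does not use GraphFrames; it assumes a tiny undirected toy HIN, represents a meta-path as a sequence of node types, and uses the number of simple paths per meta-path as a stand-in for support. The function name `discover_meta_paths` and the toy data are hypothetical.

```python
from collections import Counter, deque

def discover_meta_paths(node_type, edges, source, target, max_len):
    """Breadth-first enumeration of simple paths from source to target.

    Returns a Counter mapping each meta-path (tuple of node types) to the
    number of simple paths of at most max_len edges that instantiate it.
    Assumption: edges are undirected and edge types are ignored.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    counts = Counter()
    queue = deque([(source,)])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == target and len(path) > 1:
            # Record the type sequence of this path instance.
            counts[tuple(node_type[n] for n in path)] += 1
            continue
        if len(path) - 1 == max_len:
            continue  # length bound reached, do not extend further
        for nxt in adj.get(last, []):
            if nxt not in path:  # keep paths simple
                queue.append(path + (nxt,))
    return counts

# Hypothetical toy bibliographic HIN: authors (A), papers (P), venue (V).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
edges = [("a1", "p1"), ("a2", "p1"), ("a1", "p2"), ("a2", "p2"),
         ("p1", "v1"), ("p2", "v1")]

found = discover_meta_paths(node_type, edges, "a1", "a2", max_len=4)
top_k = found.most_common(2)  # illustrative "top-k" ranking by instance count
```

Enumerating paths this way grows quickly with graph size and path length, which is consistent with the abstract's point that support calculation and breadth-first search become costly on very large HINs.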

Application of data mining techniques to stakeholder sentiment analysis towards corporate social responsibility in the social media: a case study on S&P 500 firms

International Journal of Web Science ◽

10.1504/ijws.2013.056573 ◽

2013 ◽

Vol 2 (1/2) ◽

pp. 27

Author(s):

Markus Stiglbauer ◽

Christian Häußinger

Keyword(s):

Data Mining ◽

Social Media ◽

Corporate Social Responsibility ◽

Social Responsibility ◽

Sentiment Analysis ◽

Data Mining Techniques ◽

The Social ◽

Corporate Social

ABLE: Meta-Path Prediction in Heterogeneous Information Networks

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3494558 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-21

Author(s):

Chenji Huang ◽

Yixiang Fang ◽

Xuemin Lin ◽

Xin Cao ◽

Wenjie Zhang

Keyword(s):

State Of The Art ◽

Information Networks ◽

Information Network ◽

Embedding Method ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Head Node ◽

Meta Path ◽

Path Prediction ◽

Ap Scores

Given a heterogeneous information network (HIN) H, a head node h, a meta-path P, and a tail node t, the meta-path prediction aims at predicting whether h can be linked to t by an instance of P. Most existing solutions either require predefined meta-paths, which limits their scalability to schema-rich HINs and long meta-paths, or do not aim at predicting the existence of an instance of P. To address these issues, in this article, we propose a novel prediction model, called ABLE, by exploiting the Attention mechanism and BiLSTM for Embedding. Particularly, we present a concatenation node embedding method by considering the node types and a dynamic meta-path embedding method that carefully considers the importance and positions of edge types in the meta-paths by the Attention mechanism and BiLSTM model, respectively. A triplet embedding is then derived to complete the prediction. We conduct extensive experiments on four real datasets. The empirical results show that ABLE outperforms the state-of-the-art methods by up to 20% and 22% of improvement of AUC and AP scores, respectively.
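The question being predicted, whether h can be linked to t by an instance of P, can be illustrated with an exact check on a fully observed toy graph. The Python sketch below is not ABLE and involves no learning; it assumes the meta-path is given as a sequence of node types (the abstract's model works with edge types), that edges are undirected, and that an instance may revisit nodes. The function `has_meta_path_instance` and the toy data are hypothetical.

```python
def has_meta_path_instance(node_type, adj, head, tail, meta_path):
    """Return True if some walk from head to tail follows meta_path.

    meta_path is a sequence of node types, e.g. ("A", "P", "A").
    The frontier holds every node reachable from head by a walk that
    matches the prefix of meta_path processed so far.
    """
    if node_type[head] != meta_path[0]:
        return False
    frontier = {head}
    for wanted in meta_path[1:]:
        frontier = {v for u in frontier
                    for v in adj.get(u, ())
                    if node_type[v] == wanted}
        if not frontier:
            return False  # no node of the required type is reachable
    return tail in frontier

# Hypothetical toy HIN: authors (A), papers (P), venue (V).
node_type = {"a1": "A", "a2": "A", "a3": "A",
             "p1": "P", "p2": "P", "v1": "V"}
undirected_edges = [("a1", "p1"), ("a2", "p1"), ("a3", "p2"),
                    ("p1", "v1"), ("p2", "v1")]
adj = {}
for u, v in undirected_edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

coauthors = has_meta_path_instance(node_type, adj, "a1", "a2", ("A", "P", "A"))
not_coauthors = has_meta_path_instance(node_type, adj, "a1", "a3", ("A", "P", "A"))
same_venue = has_meta_path_instance(node_type, adj, "a1", "a3",
                                    ("A", "P", "V", "P", "A"))
```

A learned model is useful precisely when such an exact check is not possible, for example when the observed graph is incomplete; the sketch only fixes what the target label means.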

Heterogeneous information network based clustering for precision traditional Chinese medicine

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0963-0 ◽

2019 ◽

Vol 19 (S6) ◽

Author(s):

Xintian Chen ◽

Chunyang Ruan ◽

Yanchun Zhang ◽

Huijuan Chen

Keyword(s):

Chinese Medicine ◽

Traditional Chinese Medicine ◽

Large Scale ◽

Graphical Model ◽

Information Networks ◽

Structured Learning ◽

Information Network ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Heterogeneous Information Networks

Background: Traditional Chinese medicine (TCM) is a highly important complement to modern medicine and is widely practiced in China and in many other countries. The work of Chinese medicine is shaped by two factors: the inheritance and development of the clinical experience of famous Chinese medicine practitioners, and the difficulty of improving the service capacity of basic Chinese medicine practitioners. Heterogeneous information networks (HINs) are a kind of graphical model for integrating and modeling real-world information. Through HINs, we can integrate and model large-scale heterogeneous TCM data as structured graph data and use this as a basis for analysis.

Methods: Mining categorizations from TCM data is an important task for precision medicine. In this paper, we propose a novel structured learning model to solve the problem of formula regularity, a pivotal task in prescription optimization. We integrate clustering with ranking in a heterogeneous information network.

Results: The results from experiments on the Pharmacopoeia of the People’s Republic of China (ChP) demonstrate the effectiveness and accuracy of the proposed model for discovering useful categorizations of formulas.

Conclusions: We use heterogeneous information networks to model TCM data and propose a TCM-HIN. Combining the heterogeneous graph with the probability graph, we propose the TCM-Clus algorithm, which combines clustering with ranking and classifies traditional Chinese medicine prescriptions. The resulting categorizations can help Chinese medicine practitioners make clinical decisions.

Link Trustworthiness Evaluation over Multiple Heterogeneous Information Networks

Complexity ◽

10.1155/2021/6615179 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Meng Wang ◽

Xu Qin ◽

Wei Jiang ◽

Chunshu Li ◽

Guilin Qi

Keyword(s):

Real World ◽

Evaluation Model ◽

Information Networks ◽

Information Network ◽

Training Procedure ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Proposed Model ◽

Downstream Analysis

Link trustworthiness evaluation is a crucial task for information networks to evaluate the probability of a link being true in a heterogeneous information network (HIN). This task can significantly influence the effectiveness of downstream analysis. However, the performance of existing evaluation methods is limited, as they can only utilize incomplete or one-sided information from a single HIN. To address this problem, we propose a novel multi-HIN link trustworthiness evaluation model that leverages information across multiple related HINs to accomplish link trustworthiness evaluation tasks inherently and efficiently. We present an effective method to evaluate and select informative pairs across HINs and an integrated training procedure to balance inner-HIN and inter-HIN trustworthiness. Experiments on a real-world dataset demonstrate that our proposed model outperforms baseline methods and achieves the best accuracy and F1-score in downstream tasks of HINs.

Content-expressive behavior and ideological extremity: An examination of the roles of emotional intelligence and information network heterogeneity

New Media & Society ◽

10.1177/1461444816675183 ◽

2016 ◽

Vol 20 (2) ◽

pp. 815-834 ◽

Cited By ~ 6

Author(s):

Matthew Barnidge ◽

Alberto Ardèvol-Abreu ◽

Homero Gil de Zúñiga

Keyword(s):

Emotional Intelligence ◽

Information Networks ◽

Information Network ◽

Panel Survey ◽

Expressive Behavior ◽

Heterogeneous Information ◽

Network Heterogeneity ◽

Heterogeneous Information Networks ◽

The Relationship

One thriving area of research on participatory media revolves around political expression and the creation of political content. This study analyzes the connections between these behaviors, heterogeneous information networks, and ideological extremity while accounting for the role of emotional intelligence. Results from a two-wave-panel survey of US adults show that people who engage in content-expressive behavior are embedded in heterogeneous information networks and that emotional intelligence moderates the relationship between content-expressive behavior and ideological extremity.

An Efficient Recommendation Algorithm Based on Heterogeneous Information Network

Complexity ◽

10.1155/2021/6689323 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Ying Yin ◽

Wanning Zheng

Keyword(s):

Real Data ◽

Structural Features ◽

Information Networks ◽

Information Network ◽

Complex Objects ◽

Recommendation Algorithm ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Heterogeneous Information Networks ◽

Manual Search

Heterogeneous information networks can naturally model complex objects, and they can enrich recommendation systems through the connections between different types of objects. A large number of recommendation algorithms based on heterogeneous information networks have been proposed. However, the existing algorithms cannot extract and combine the structural features in heterogeneous information networks. This paper therefore proposes an efficient recommendation algorithm based on a heterogeneous information network, which uses the characteristics of a graph convolutional neural network to learn node information automatically, extracting heterogeneous information and avoiding errors caused by the manual search for metapaths. Furthermore, by fully considering the scoring relationship between nodes, a calculation strategy combining heterogeneous information and a scoring information fusion strategy is proposed to solve the scoring between nodes, which makes the predicted scores more accurate. Finally, by updating the nodes, the training scale is reduced and the calculation efficiency is improved. The study conducted a large number of experiments on three real data sets with millions of edges. The results show that, compared with PMF, SemRec, and other algorithms, the proposed algorithm improves the recommendation accuracy in terms of MAE by approximately 3% and RMSE by approximately 8%, and reduces the time consumption significantly.
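The two accuracy measures named above, MAE and RMSE, follow their standard definitions and can be sketched in a few lines of Python. The ratings below are made-up illustrative values, not data from the paper, and the helper names `mae` and `rmse` are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Made-up ratings for illustration only.
actual_ratings = [4, 3, 5, 2]
predicted_ratings = [3, 3, 4, 4]

example_mae = mae(actual_ratings, predicted_ratings)    # (1 + 0 + 1 + 2) / 4
example_rmse = rmse(actual_ratings, predicted_ratings)  # sqrt((1 + 0 + 1 + 4) / 4)
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, so the two measures can improve by different amounts for the same algorithm.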

Analysis of Mining Social Media – A Learners Perspective

International Journal for Research in Engineering Application & Management ◽

10.35291/2454-9150.2020.0017 ◽

2020 ◽

pp. 91-96

Keyword(s):

Data Mining ◽

Social Media ◽

Performance Metrics ◽

Research Focus ◽

Social Media Data ◽

Data Mining Techniques ◽

Huge Data ◽

The Social ◽

The Common ◽

Media Data

In a digitalized world, data plays a key role. Data may be structured or unstructured. Data mining techniques are applied to structured data to find unknown patterns in known data. Social media, however, holds huge volumes of data owing to its rapid growth, and that data is dynamic and unstructured, so traditional data mining techniques are not appropriate. Combining data mining with social media gives the user insight into how such data can be mined. Social media lets individuals connect with others according to their interests. People use Facebook, Twitter, LinkedIn, Academia.edu, and Google+ to share their views, thoughts, and day-to-day happenings on one or more of these sites. This paper describes how those sites are classified by size, data, research focus, design issues, type of site, and type of user, along with the common approaches used on social networks. It is intended to help researchers understand how social media and social networking websites are structurally classified, and it reviews existing data mining techniques, the performance metrics used in past research, and tools for retrieving social media data.
