Shall I Work with Them? A Knowledge Graph-Based Approach for Predicting Future Research Collaborations

Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 664
Author(s):  
Nikos Kanakaris ◽  
Nikolaos Giarelis ◽  
Ilias Siachos ◽  
Nikos Karacapilidis

We consider the prediction of future research collaborations as a link prediction problem applied to a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph-kernel-based approaches on the performance of an ML model, as far as the link prediction problem is concerned, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach through various feature combinations with respect to the link prediction problem. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the abovementioned performance metrics in the prediction of future research collaborations.
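As a rough illustration of the kind of classical link prediction baselines the abstract refers to, the sketch below scores unconnected author pairs with three well-known neighborhood heuristics (common neighbors, Jaccard coefficient, Adamic-Adar) and computes precision and recall. The abstract does not say which classical algorithms were used, so the choice of heuristics is an assumption, and the toy co-authorship graph and all function names are hypothetical.

```python
from itertools import combinations
from math import log

# Toy co-authorship graph; the author labels are hypothetical.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def build_adjacency(edge_list):
    """Return an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of co-authors that u and v share."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared co-authors divided by all co-authors of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Shared co-authors, each weighted down by the log of its degree.

    A shared neighbor is linked to both u and v, so its degree is at
    least 2 and the logarithm is always positive.
    """
    return sum(1.0 / log(len(adj[w])) for w in adj[u] & adj[v])


def rank_candidates(adj, score):
    """Rank all currently unconnected pairs by the given score, best first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


def precision_recall(predicted, actual):
    """Precision and recall of predicted links against links that appeared."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


adj = build_adjacency(edges)
```

Calling `rank_candidates(adj, common_neighbors)` places the pair `("A", "D")` first, because A and D already share two co-authors. Scores of this kind use only graph structure, which is the limitation the approach described above addresses by adding textual information.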

Author(s):  
Neil Veira ◽  
Brian Keng ◽  
Kanchana Padmanabhan ◽  
Andreas Veneris

Knowledge graph embeddings are instrumental for representing and learning from multi-relational data, with recent embedding models showing high effectiveness for inferring new facts from existing databases. However, such precisely structured data is usually limited in quantity and in scope. Therefore, to fully optimize the embeddings it is important to also consider more widely available sources of information such as text. This paper describes an unsupervised approach to incorporate textual information by augmenting entity embeddings with embeddings of associated words. The approach does not modify the optimization objective for the knowledge graph embedding, which allows it to be integrated with existing embedding models. Two distinct forms of textual data are considered, with different embedding enhancements proposed for each case. In the first case, each entity has an associated text document that describes it. In the second case, a text document is not available, and instead entities occur as words or phrases in an unstructured corpus of text fragments. Experiments show that both methods can offer improvement on the link prediction task when applied to many different knowledge graph embedding models.
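The abstract does not spell out how the word embeddings are combined with the entity embeddings, so the following is only one plausible sketch: a weighted average of an already trained entity vector and the mean of the word vectors from its description. The blending rule, the `weight` parameter, and both function names are assumptions made for illustration, not the authors' method. Because the blend happens after training, it leaves the embedding model's optimization objective untouched, in line with the description above.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def augment_entity(entity_vec, word_vecs, weight=0.5):
    """Blend a trained entity vector with the mean of its word vectors.

    weight is the share given to the text; 0.0 returns the entity
    vector unchanged. This blending rule is an illustrative assumption.
    """
    if not word_vecs:
        # No text available for this entity: keep the original embedding.
        return list(entity_vec)
    text_vec = mean_vector(word_vecs)
    return [(1 - weight) * e + weight * t for e, t in zip(entity_vec, text_vec)]
```

For example, `augment_entity([1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]])` averages the two word vectors to `[0.0, 2.0]` and then blends that equally with the entity vector.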


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 485
Author(s):  
Meihong Wang ◽  
Linling Qiu ◽  
Xiaoli Wang

Knowledge graphs (KGs) have been widely used in the field of artificial intelligence, in areas such as information retrieval, natural language processing, and recommendation systems. However, the open nature of KGs often implies that they are incomplete and contain inherent defects. This creates the need to build a more complete knowledge graph for enhancing the practical utilization of KGs. Link prediction is a fundamental task in knowledge graph completion that utilizes existing relations to infer new relations so as to build a more complete knowledge graph. Numerous methods have been proposed to perform the link-prediction task based on various representation techniques. Among them, KG-embedding models have significantly advanced the state of the art in the past few years. In this paper, we provide a comprehensive survey on KG-embedding models for link prediction in knowledge graphs. We first provide a theoretical analysis and comparison of existing methods proposed to date for generating KG embeddings. Then, we investigate several representative models that are classified into five categories. Finally, we conduct experiments on two benchmark datasets to report comprehensive findings and provide some new insights into the strengths and weaknesses of existing models.


Author(s):  
Thanh Le ◽  
Hoang Nguyen ◽  
Bac Le

Link prediction in knowledge graphs is playing an increasingly essential role in both research and applications. By detecting latent connections, we can refine the knowledge in the graph, discover interesting relationships, answer user questions, or make item suggestions. In this paper, we survey the methods that are currently achieving good results in link prediction. Specifically, we cover both static and temporal graphs. First, we divide the algorithms into groups based on how they represent the characteristics of entities and relations. After that, we describe the original idea and analyze the key improvements. Within each group, we compare the methods and examine their pros and cons as well as their applications. Based on that, we draw out the correlation between the two graph types in link prediction. Finally, from this overview of the link prediction problem, we propose some directions for improving the models in future studies.


Author(s):  
Luisa Andreu ◽  
Enrique Bigne ◽  
Suzanne Amaro ◽  
Jesús Palomo

Purpose: The purpose of this study is to examine Airbnb research using bibliometric methods. Using research performance analysis, this study highlights and provides an updated overview of Airbnb research by revealing patterns in journals, papers and most influential authors and countries. Furthermore, it graphically illustrates how research themes have evolved by mapping a co-word analysis and points out potential trends for future research.
Design/methodology/approach: The methodological design for this study involves three phases: the document source selection, the definition of the variables to be analyzed and the bibliometric analysis. A statistical multivariate analysis of all the documents’ characteristics was performed with R software. Furthermore, natural language processing techniques were used to analyze all the abstracts and keywords specified in the 129 selected documents.
Findings: Results show the genesis and evolution of publications on Airbnb research, scatter of journals and journals’ characteristics, author and productivity characteristics, geographical distribution of the research and content analysis using keywords.
Research limitations/implications: Despite Airbnb having a history of 10 years, research publications only started in 2015. Therefore, the bibliometric study includes papers from 2015 to 2019. One of the main limitations is that papers were selected in October of 2019, before the year was over. However, the latest academic publications (in press and earlycite) were included in the analysis.
Originality/value: This study analyzed a set of bibliometric laws (Price’s, Lotka’s and Bradford’s) to better understand the patterns of the most relevant scientific production regarding Airbnb in tourism and hospitality journals. Using natural language processing techniques, this study analyzes all the abstracts and keywords specified in the selected documents. Results show the evolution of research topics in four periods: 2015-2016, 2017, 2018 and 2019.
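Of the three bibliometric laws named in the abstract, Lotka's law is the simplest to state: the number of authors who publish n papers is roughly C / n^a, with the exponent a often close to 2. The sketch below computes the expected counts and estimates the exponent with a least-squares fit in log-log space. The function names are hypothetical, any input counts are the caller's own, and nothing here reproduces the study's R analysis or its results.

```python
from math import log


def lotka_expected(single_paper_authors, n_papers, exponent=2.0):
    """Expected number of authors with n_papers papers under Lotka's law."""
    return single_paper_authors / (n_papers ** exponent)


def fit_lotka_exponent(author_counts):
    """Estimate the Lotka exponent from observed productivity counts.

    author_counts maps a number of papers n to the number of authors who
    published exactly n papers. The exponent is the negated slope of an
    ordinary least-squares line through the points (log n, log count).
    """
    xs = [log(n) for n in author_counts]
    ys = [log(count) for count in author_counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance
```

A fitted exponent near 2 would indicate that author productivity in the sample follows the classic inverse-square pattern; a noticeably different value suggests a more or less concentrated author population.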


Author(s):  
Anjali Daisy

Nowadays, as computer systems are expected to be intelligent, techniques that help modern applications understand human languages are in great demand. Among these techniques, latent semantic models are the most important. They exploit the latent semantics of lexicons and concepts of human languages and transform them into tractable, machine-understandable numerical representations. Without that, languages are nothing but combinations of meaningless symbols for the machine. To provide such learned representations, embedding models for knowledge graphs have attracted much attention in recent years, since they intuitively transform important concepts and entities in human languages into vector representations and realize relational inferences among them via simple vector calculation. Such novel techniques have effectively resolved a few tasks like knowledge graph completion and link prediction, and show great potential to be incorporated into more natural language processing (NLP) applications.
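To make "relational inference via simple vector calculation" concrete, here is a minimal sketch in the style of translation-based models such as TransE, where a fact (head, relation, tail) is considered plausible when head + relation lies close to tail. The abstract does not name a specific model, so using TransE-style scoring is an assumption; the two-dimensional vectors are hand-picked for illustration rather than learned, and the entity and relation names are hypothetical.

```python
from math import sqrt

# Hand-picked toy vectors; a real model would learn these from data.
entities = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [0.0, 0.0],
    "Germany": [0.0, 1.0],
}
relations = {"capital_of": [0.0, 1.0]}


def transe_score(head, relation, tail):
    """Negative Euclidean distance between head + relation and tail.

    Higher (closer to zero) means the triple is more plausible.
    """
    return -sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def predict_tail(head_name, relation_name):
    """Return the entity whose vector best completes (head, relation, ?)."""
    head = entities[head_name]
    relation = relations[relation_name]
    candidates = (name for name in entities if name != head_name)
    return max(
        candidates,
        key=lambda name: transe_score(head, relation, entities[name]),
    )
```

With these toy vectors, `predict_tail("Paris", "capital_of")` selects `"France"`, since adding the relation vector to the head vector lands exactly on the vector for France. Link prediction and knowledge graph completion reduce to this kind of nearest-candidate search once the embeddings have been trained.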


Author(s):  
Masaki Asada ◽  
Nallappan Gunasekaran ◽  
Makoto Miwa ◽  
Yutaka Sasaki

We deal with a heterogeneous pharmaceutical knowledge-graph containing textual information built from several databases. The knowledge graph is a heterogeneous graph that includes a wide variety of concepts and attributes, some of which are provided in the form of textual pieces of information which have not been targeted in the conventional graph completion tasks. To investigate the utility of textual information for knowledge graph completion, we generate embeddings from textual descriptions given to heterogeneous items, such as drugs and proteins, while learning knowledge graph embeddings. We evaluate the obtained graph embeddings on the link prediction task for knowledge graph completion, which can be used for drug discovery and repurposing. We also compare the results with existing methods and discuss the utility of the textual information.


Author(s):  
Jan H Kroeze

A very large percentage of business and academic data is stored in textual format. With the exception of metadata, such as author, date, title and publisher, this data is not overtly structured like the standard, mainly numerical, data in relational databases. Parallel to data mining, which finds new patterns and trends in numerical data, text mining is the process aimed at discovering unknown patterns in free text. Owing to the importance of competitive and scientific knowledge that can be exploited from these texts, “text mining has become an increasingly popular and essential theme in data mining” (Han & Kamber, 2001, p. 428). Text mining is an evolving field and its relatively short history goes hand in hand with the recent explosion in availability of electronic textual information. Chen (2001, p. vi) remarks that “text mining is an emerging technical area that is relatively unknown to IT professions”. This explains the fact that despite the value of text mining, most research and development efforts still focus on data mining using structured data (Fan et al., 2006). In the next section, the background and need for text mining will be discussed after which the various uses and techniques of text mining are described. The importance of visualisation and some critical issues will then be discussed followed by some suggestions for future research topics.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Wei Du ◽  
Qiang Yan ◽  
Wenping Zhang ◽  
Jian Ma

Purpose: Patent trade recommendations necessitate recommendation interpretability in addition to recommendation accuracy because of patent transaction risks and the technological complexity of patents. This study designs an interpretable knowledge-aware patent recommendation model (IKPRM) for patent trading. IKPRM first creates a patent knowledge graph (PKG) for patent trade recommendations and then leverages paths in the PKG to achieve recommendation interpretability.
Design/methodology/approach: First, we construct a PKG to integrate online company behaviors and patent information using natural language processing techniques. Second, a bidirectional long short-term memory network (BiLSTM) is utilized with an attention mechanism to establish the connecting paths of a company-patent pair in PKG. Finally, the prediction score of a company-patent pair is calculated by assigning different weights to their connecting paths. The semantic relationships in connecting paths help explain why a candidate patent is recommended.
Findings: Experiments on a real dataset from a patent trading platform verify that IKPRM significantly outperforms baseline methods in terms of hit ratio and normalized discounted cumulative gain (nDCG). The analysis of an online user study verified the interpretability of our recommendations.
Originality/value: A meta-path-based recommendation can achieve certain explainability but suffers from low flexibility when reasoning on heterogeneous information. To bridge this gap, we propose the IKPRM to explain the full paths in the knowledge graph. IKPRM demonstrates good performance and transparency and is a solid foundation for integrating interpretable artificial intelligence into complex tasks such as intelligent recommendations.
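The two evaluation metrics named in the findings, hit ratio and nDCG, can be sketched as follows for a single ranked recommendation list with binary relevance. These are the standard textbook definitions; the abstract does not give the cutoff or the exact evaluation protocol, so the parameter `k`, the function names, and the binary-relevance assumption are illustrative only.

```python
from math import log2


def hit_ratio_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top k, otherwise 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def dcg_at_k(ranked, relevant, k):
    """Discounted cumulative gain with binary relevance.

    A relevant item at 0-based position i contributes 1 / log2(i + 2),
    so hits near the top of the list count more.
    """
    return sum(
        1.0 / log2(position + 2)
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )


def ndcg_at_k(ranked, relevant, k):
    """DCG divided by the DCG of an ideal ranking, giving a value in [0, 1]."""
    ideal_hits = min(len(relevant), k)
    ideal_dcg = sum(1.0 / log2(position + 2) for position in range(ideal_hits))
    return dcg_at_k(ranked, relevant, k) / ideal_dcg if ideal_dcg else 0.0
```

In practice both metrics are averaged over all test users (here, companies). Hit ratio only records whether a relevant patent made the top k, while nDCG also rewards placing it higher in the list.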


Technologies ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 100
Author(s):  
Jayden Khakurel ◽  
Birgit Penzenstadler ◽  
Jari Porras ◽  
Antti Knutas ◽  
Wenlu Zhang

Since the 1950s, artificial intelligence (AI) has been a recurring topic in research. However, this field has only recently gained significant momentum because of the advances in technology and algorithms, along with new AI techniques such as machine learning methods for structured data, modern deep learning, and natural language processing for unstructured data. Although companies are eager to join the fray of this new AI trend and take advantage of its potential benefits, it is unclear what implications AI will have on society now and in the long term. Using the five dimensions of sustainability to structure the analysis, we explore the impacts of AI on several domains. We find that there is a significant impact on all five dimensions, with positive and negative impacts, and that value, collaboration, sharing responsibilities, and ethics will play a vital role in any future sustainable development of AI in society. Our exploration provides a foundation for in-depth discussions and future research collaborations.

