Data Set and Evaluation of Automated Construction of Financial Knowledge Graph

2021 ◽  
pp. 1-21
Author(s):  
Wenguang Wang ◽  
Yonglin Xu ◽  
Chunhui Du ◽  
Yunwen Chen ◽  
Yijie Wang ◽  
...  

With the development of entity extraction, relationship extraction, knowledge reasoning, and entity linking, knowledge graph technology has advanced rapidly in recent years. To better promote the development of knowledge graphs, especially for the Chinese language and the financial industry, we built a high-quality data set, named the financial research report knowledge graph (FR2KG), and organized the Automated Construction of Financial Knowledge Graph evaluation task at the 2020 China Conference on Knowledge Graph and Semantic Computing (CCKS2020). FR2KG consists of 17,799 entities, 26,798 relationship triples, and 1,328 attribute triples, covering 10 entity types, 19 relationship types, and 6 attributes. Participants were required to develop a constructor that automatically builds a financial knowledge graph based on FR2KG. In addition, we summarize the technologies for automatically constructing knowledge graphs and introduce the methods used by the winners and the results of this evaluation.
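The abstract distinguishes two kinds of triples in FR2KG: relationship triples, which link two typed entities, and attribute triples, which attach a literal value to an entity. The sketch below illustrates that distinction with a minimal in-memory store; all entity names, relation names, and values are invented for illustration, not taken from FR2KG.

```python
# Minimal sketch of the two triple kinds described for FR2KG:
# relationship triples link two entities, attribute triples attach
# a literal value to an entity. Names and values are illustrative.

from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        self.entities = {}            # name -> entity type
        self.rel_triples = []         # (head, relation, tail)
        self.attr_triples = []        # (entity, attribute, literal)
        self._out = defaultdict(list) # adjacency index for lookups

    def add_entity(self, name, etype):
        self.entities[name] = etype

    def add_relation(self, head, relation, tail):
        # Relationship triples must connect two known entities.
        assert head in self.entities and tail in self.entities
        self.rel_triples.append((head, relation, tail))
        self._out[head].append((relation, tail))

    def add_attribute(self, entity, attribute, value):
        # Attribute triples attach a literal to a known entity.
        assert entity in self.entities
        self.attr_triples.append((entity, attribute, value))

    def neighbors(self, entity):
        return self._out[entity]

kg = KnowledgeGraph()
kg.add_entity("CompanyA", "company")
kg.add_entity("Sector1", "industry")
kg.add_relation("CompanyA", "belongs_to", "Sector1")
kg.add_attribute("CompanyA", "listing_date", "2010-06-01")
print(kg.neighbors("CompanyA"))  # [('belongs_to', 'Sector1')]
```

A real constructor for the evaluation would populate such a store from extracted research-report text rather than by hand.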

Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 186 ◽  
Author(s):  
Shuang Liu ◽  
Hui Yang ◽  
Jiayi Li ◽  
Simon Kolmanič

With rising living standards, rapid economic growth, and advances in information science and technology, the Chinese public has paid increasing attention to ancient Chinese history and culture. Information technology has been shown to promote the spread and development of historical culture, and it is becoming a necessary means of promoting traditional culture. This paper builds a knowledge graph of ancient Chinese history and culture so that the public can understand the relevant knowledge more quickly and accurately. The construction process is as follows. First, crawler technology is used to obtain text and table data related to ancient history and culture from Baidu Encyclopedia (similar to Wikipedia) and related pages. The crawler extracts the semi-structured data in the Baidu Encyclopedia information box (InfoBox) to directly construct the triples required for the knowledge graph, and crawls the introductory text of Baidu Encyclopedia entries and specialized historical and cultural websites (history Chunqiu.com, On History.com) to extract unstructured entities and relationships. Second, entity recognition and relationship extraction are performed on the unstructured text. Entity recognition uses the Bidirectional Long Short-Term Memory-Convolutional Neural Network-Conditional Random Field (BiLSTM-CNN-CRF) model, and the relationships between entities are extracted with the open-source tool DeepKE (an information extraction tool with language recognition ability developed by Zhejiang University). After the entities and the relationships between them are obtained, they are supplemented with the triple data constructed from the existing knowledge base and the semi-structured Baidu Encyclopedia InfoBox data. Finally, ontology construction and quality evaluation of the constructed knowledge graph are performed to form the final knowledge graph of ancient Chinese history and culture.
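The pipeline above notes that InfoBox data is semi-structured enough to be mapped directly to triples, without the NER and relation-extraction steps needed for free text. A minimal sketch of that direct mapping, assuming the crawler has already parsed an InfoBox into a dict (the field names and values below are hypothetical):

```python
# Hedged sketch: turning a (hypothetical) already-parsed InfoBox dict
# directly into (subject, predicate, object) triples, as the pipeline
# describes for semi-structured Baidu Encyclopedia data.

def infobox_to_triples(subject, infobox):
    """Map each InfoBox field to one or more triples about the entry's subject."""
    triples = []
    for field, value in infobox.items():
        # Multi-valued fields become one triple per value.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, field, v))
    return triples

infobox = {
    "dynasty": "Tang",
    "capital": ["Chang'an", "Luoyang"],
}
triples = infobox_to_triples("Tang Dynasty", infobox)
print(triples)
```

In the actual system these triples would then be merged with the entities and relations extracted from unstructured text by BiLSTM-CNN-CRF and DeepKE.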


Author(s):  
Fuhua Shang ◽  
Qiuyu Ding ◽  
Ruishan Du ◽  
Maojun Cao ◽  
Huanyu Chen

The analysis of user behavior yields a large amount of useful information; once extracted, this information is called user knowledge. User knowledge plays a guiding role in implementing user-centric updates for software platforms, and representing and applying it well can accelerate the development of a software platform and improve its quality. This paper aims to further the utilization of user knowledge by mining the knowledge implicit in user behavior and then constructing a knowledge graph of this behavior. First, associations between software bugs and software components are mined from the user knowledge. Then, knowledge entity extraction and relationship extraction are performed on the development code and the user behavior. Finally, the knowledge is stored in a graph database, from which it can be visually retrieved. Experiments on CIFLog, an integrated logging processing software platform, demonstrate the effectiveness of this approach. Constructing a user behavior knowledge graph improves the utilization of user knowledge as well as the quality of software platform development.
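The core retrieval pattern described — storing mined bug-component associations as graph edges and querying them — can be sketched with a plain-Python stand-in for the graph database; the bug IDs and component names below are invented examples, and a production system would use a real graph store as the paper does.

```python
# Sketch (pure-Python stand-in for a graph database): store mined
# bug/component associations as labeled edges, then retrieve every
# component linked to a given bug. All identifiers are illustrative.

from collections import defaultdict

edges = defaultdict(set)  # (node, relation) -> set of neighbor nodes

def add_edge(head, relation, tail):
    edges[(head, relation)].add(tail)

# Associations mined from user behavior (hypothetical examples).
add_edge("BUG-101", "affects", "plot_renderer")
add_edge("BUG-101", "affects", "curve_editor")
add_edge("BUG-102", "affects", "data_importer")

def components_for_bug(bug):
    return sorted(edges[(bug, "affects")])

print(components_for_bug("BUG-101"))  # ['curve_editor', 'plot_renderer']
```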


2021 ◽  
Vol 13 (6) ◽  
pp. 3191
Author(s):  
Jiyuan Tan ◽  
Qianqian Qiu ◽  
Weiwei Guo ◽  
Tingshuai Li

The integration of multi-source transportation data is complex and insufficient in most big cities, making it difficult for researchers to conduct the in-depth data mining needed to improve policy and management. To solve this problem, this paper uses a top-down approach to construct a knowledge graph of an urban traffic system. First, the model layer of the knowledge graph was designed to enable the reuse and sharing of knowledge and was stored in the graph database Neo4j. Second, a representation-learning-based knowledge reasoning model was adopted to perform knowledge completion and improve the knowledge graph. Finally, the proposed method was validated on an urban traffic data set, and the results showed that the model can mine implicit relationships between traffic entities and discover traffic knowledge effectively.
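The abstract does not name the representation-learning model used for knowledge completion, so the sketch below shows one common instance, TransE-style link prediction: a triple (h, r, t) is scored by the distance ||h + r − t||, and completion ranks candidate tails by that score. The entities, relation, and random embeddings are all illustrative.

```python
# Illustrative TransE-style knowledge completion (one common
# representation-learning approach; the paper's exact model is not
# specified in the abstract). Embeddings here are random toy vectors.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
entities = {name: rng.normal(size=dim) for name in
            ["road_A", "road_B", "sensor_1", "district_X"]}
relations = {"located_in": rng.normal(size=dim)}

def transe_score(h, r, t):
    # Lower is better: a valid triple should satisfy h + r ~ t.
    return np.linalg.norm(entities[h] + relations[r] - entities[t])

def complete_tail(h, r):
    # Knowledge completion: rank all candidate tails by TransE score.
    candidates = [t for t in entities if t != h]
    return min(candidates, key=lambda t: transe_score(h, r, t))

best = complete_tail("sensor_1", "located_in")
print(best)
```

With trained (rather than random) embeddings, the top-ranked tail is the model's predicted missing fact, which is how completion improves the stored graph.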


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 998
Author(s):  
Peng Zhang ◽  
Yi Bu ◽  
Peng Jiang ◽  
Xiaowen Shi ◽  
Bing Lun ◽  
...  

This study builds a coronavirus knowledge graph (KG) by merging two information sources. The first is the Analytical Graph (AG), which integrates more than 20 different public datasets related to drug discovery. The second is CORD-19, a collection of published scientific articles related to COVID-19. We combined the chemogenomic entities in AG with entities extracted from CORD-19 to expand knowledge in the COVID-19 domain. Before populating the KG with those entities, we performed entity disambiguation on the CORD-19 collection using Wikidata. Our newly built KG contains at least 21,700 genes, 2,500 diseases, 94,000 phenotypes, and other biological entities (e.g., compounds, species, and cell lines). We define 27 relationship types and use them to label each edge in the KG. This research presents two cases to evaluate the KG's usability: analyzing a subgraph (an ego-centered network) around the angiotensin-converting enzyme (ACE) and revealing paths between biological entities (hydroxychloroquine and the IL-6 receptor; chloroquine and STAT1). The ego-centered network captured information related to COVID-19, and we also found significant COVID-19-related information in top-ranked paths with a depth of three based on our path evaluation.
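The path analysis described — enumerating paths of depth up to three between two biological entities — can be sketched with a depth-bounded search over a toy graph. The edges below are invented for illustration and are not taken from the actual AG/CORD-19 graph.

```python
# Sketch of depth-bounded path enumeration between two entities,
# as in the paper's path evaluation (depth = number of edges).
# The toy edges below are illustrative, not from the real KG.

toy_kg = {
    "hydroxychloroquine": ["TLR9", "ACE2"],
    "TLR9": ["IL-6"],
    "ACE2": ["IL-6 receptor"],
    "IL-6": ["IL-6 receptor"],
}

def paths_up_to_depth(graph, start, goal, max_depth=3):
    found, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            found.append(path)
            continue
        # A path with d edges has d + 1 nodes; stop extending at the bound.
        if len(path) > max_depth:
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # keep paths simple (no revisits)
                stack.append((nxt, path + [nxt]))
    return found

paths = paths_up_to_depth(toy_kg, "hydroxychloroquine", "IL-6 receptor")
print(paths)
```

Ranking such paths (e.g., by edge-type relevance) would then surface the top candidates the study examined.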


Author(s):  
Ahmad R. Alsaber ◽  
Jiazhu Pan ◽  
Adeeba Al-Hurban 

In environmental research, missing data are often a challenge for statistical modeling. This paper addresses advanced techniques for dealing with missing values in an air quality data set using a multiple imputation (MI) approach. The MCAR, MAR, and NMAR missingness mechanisms are applied to the data set, at five missingness levels: 5%, 10%, 20%, 30%, and 40%. The imputation method used in this paper is missForest, an iterative imputation method based on random forests. Air quality data were gathered from five monitoring stations in Kuwait and aggregated to a daily basis. A logarithm transformation was applied to all pollutant data in order to normalize their distributions and minimize skewness. We found high levels of missing values for NO2 (18.4%), CO (18.5%), PM10 (57.4%), SO2 (19.0%), and O3 (18.2%). Climatological data (i.e., air temperature, relative humidity, wind direction, and wind speed) were used as control variables for better estimation. The results show that the MAR mechanism had the lowest RMSE and MAE. We conclude that MI using the missForest approach estimates missing values with a high level of accuracy: missForest had the lowest imputation error (RMSE and MAE) among the compared imputation methods and can thus be considered appropriate for analyzing air quality data.
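The missForest procedure can be approximated with scikit-learn's `IterativeImputer` using a random-forest estimator: each column with missing values is iteratively regressed on the others until the imputations stabilize. This is a sketch on synthetic data, not the paper's implementation (which uses the missForest package itself) or its Kuwait data set.

```python
# Sketch of missForest-style iterative imputation via scikit-learn's
# IterativeImputer with a RandomForestRegressor (an approximation of
# the missForest procedure; the data below are synthetic).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))    # synthetic "pollutant" columns
X[:, 1] += 0.8 * X[:, 0]         # add correlation so imputation has signal
mask = rng.random(X.shape) < 0.2 # ~20% of values missing at random
X_missing = X.copy()
X_missing[mask] = np.nan

imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=20, random_state=0),
    max_iter=5,
    random_state=0,
)
X_imputed = imputer.fit_transform(X_missing)

# Evaluate against the known ground truth, as the paper does with RMSE.
rmse = np.sqrt(np.mean((X_imputed[mask] - X[mask]) ** 2))
print(f"imputation RMSE: {rmse:.3f}")
```

Repeating this over the 5%-40% missingness levels and the three missingness mechanisms reproduces the shape of the paper's comparison.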


Author(s):  
Sebastian Hoppe Nesgaard Jensen ◽  
Mads Emil Brix Doest ◽  
Henrik Aanæs ◽  
Alessio Del Bue

Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, and its solution is necessary for obtaining 3D information from multiple images when the scene is dynamic. A main obstacle to the further development of this important topic is the lack of high-quality data sets. We address this issue by presenting a publicly available data set created for this purpose, considerably larger than the previous state of the art. To validate the applicability of this data set, and to investigate the state of the art of NRSfM, including potential directions forward, we present a benchmark and a thorough evaluation using this data set. The benchmark evaluates 18 different methods with available code that reasonably span the state of the art in sparse NRSfM. This new public data set and evaluation protocol will provide benchmark tools for further development in this challenging field.


2021 ◽  
Vol 11 (15) ◽  
pp. 7104
Author(s):  
Xu Yang ◽  
Ziyi Huan ◽  
Yisong Zhai ◽  
Ting Lin

Personalized recommendation based on knowledge graphs has become a research hot spot due to its good recommendation performance, and it is the subject of this paper. First, we study knowledge graph construction methods and build a movie knowledge graph, using the Neo4j graph database to store and vividly display the movie data. Then, we study the classical translation model TransE in knowledge graph representation learning and improve it through a cross-training method that uses information from the neighboring feature structures of entities in the knowledge graph; the negative sampling process of the TransE algorithm is also improved. The experimental results show that the improved TransE model vectorizes entities and relations more accurately. Finally, we construct recommendation models by combining knowledge graphs with learning-to-rank and neural networks: a Bayesian personalized ranking model based on knowledge graphs (KG-BPR) and a neural network recommendation model based on knowledge graphs (KG-NN). The semantic information of entities and relations in the knowledge graph is embedded into vector space using the improved TransE method, and the resulting item entity vectors, which carry external knowledge, are integrated into the BPR model and the neural network, respectively, making up for the lack of knowledge about the items themselves. Experimental analysis on the MovieLens-1M data set shows that the two proposed recommendation models effectively improve the accuracy, recall, F1, and MAP of recommendation.
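The abstract improves TransE's negative sampling but does not detail the scheme, so the sketch below shows only the standard baseline it starts from: corrupting either the head or the tail of a true triple with a uniformly sampled entity, rejecting corruptions that happen to be true. The movie entities and triples are illustrative.

```python
# Baseline negative sampling for TransE training (the standard scheme
# the paper improves on; its specific improvement is not detailed in
# the abstract). Entities and triples are illustrative.
import random

entities = ["Inception", "Interstellar", "Nolan", "DiCaprio"]
true_triples = {("Inception", "directed_by", "Nolan"),
                ("Inception", "stars", "DiCaprio")}

def corrupt(triple, rng=random):
    """Return a negative triple not present in the true triple set."""
    h, r, t = triple
    while True:
        e = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        neg = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if neg not in true_triples:
            return neg

neg = corrupt(("Inception", "directed_by", "Nolan"))
print(neg)
```

During training, each positive triple is paired with such a negative so the margin loss can push valid triples to score better than corrupted ones.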


2018 ◽  
Vol 141 (3) ◽  
Author(s):  
Artur Joao Carvalho Figueiredo ◽  
Robin Jones ◽  
Oliver J. Pountney ◽  
James A. Scobie ◽  
Gary D. Lock ◽  
...  

This paper presents volumetric velocimetry (VV) measurements for a jet in crossflow that is representative of film cooling. VV employs particle tracking to nonintrusively extract all three components of velocity in a three-dimensional volume. This is its first use in a film-cooling context. The primary research objective was to develop this novel measurement technique for turbomachinery applications, while collecting a high-quality data set that can improve the understanding of the flow structure of the cooling jet. A new facility was designed and manufactured for this study with emphasis on optical access and controlled boundary conditions. For a range of momentum flux ratios from 0.65 to 6.5, the measurements clearly show the penetration of the cooling jet into the freestream, the formation of kidney-shaped vortices, and entrainment of main flow into the jet. The results are compared to published studies using different experimental techniques, with good agreement. Further quantitative analysis of the location of the kidney vortices demonstrates their lift off from the wall and increasing lateral separation with increasing momentum flux ratio. The lateral divergence correlates very well with the self-induced velocity created by the wall–vortex interaction. Circulation measurements quantify the initial roll up and decay of the kidney vortices and show that the point of maximum circulation moves downstream with increasing momentum flux ratio. The potential for nonintrusive VV measurements in turbomachinery flow has been clearly demonstrated.
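The momentum flux ratio that parameterizes the study is conventionally defined as I = (ρ_c V_c²)/(ρ_∞ V_∞²), the ratio of coolant-jet to freestream momentum flux. A trivial calculation with illustrative values (not taken from the experiment):

```python
# Momentum flux ratio I = (rho_c * V_c**2) / (rho_inf * V_inf**2),
# the parameter varied from 0.65 to 6.5 in the study. The densities
# and velocities below are illustrative, not experimental values.

def momentum_flux_ratio(rho_c, v_c, rho_inf, v_inf):
    return (rho_c * v_c ** 2) / (rho_inf * v_inf ** 2)

# A coolant jet at twice the freestream velocity with matched density:
I = momentum_flux_ratio(rho_c=1.2, v_c=20.0, rho_inf=1.2, v_inf=10.0)
print(I)  # 4.0
```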

