Russian-English dataset and comparative analysis of algorithms for cross-language embedding-based entity alignment
Abstract. The problem of fusing data from databases and knowledge graphs in different languages is becoming increasingly important. The main step of such fusion is identifying equivalent entities in different knowledge graphs and merging their descriptions. This problem is known as identity resolution, or entity alignment. Recently, a large group of new entity alignment methods has emerged. These methods compute so-called "embeddings" (vector representations) of entities and establish the equivalence of entities by comparing their embeddings. This paper presents experiments with embedding-based entity alignment algorithms on a Russian-English dataset. The purpose of this work is to identify language-specific features of the entity alignment algorithms. Future directions of research are also outlined.
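To make the idea of comparing embeddings concrete, the following minimal sketch aligns entities across two graphs by picking, for each source entity, the target entity with the highest cosine similarity. It is only an illustration under stated assumptions: the vectors are hand-made toy values, the names `cosine`, `align`, `ru_embeddings`, and `en_embeddings` are hypothetical, and cosine similarity with nearest-neighbor matching is one common choice, not necessarily the one used by the algorithms evaluated in this paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def align(source, target):
    """Map each source entity to the target entity with the most similar embedding."""
    return {
        s_name: max(target, key=lambda t_name: cosine(s_vec, target[t_name]))
        for s_name, s_vec in source.items()
    }


# Toy vectors (assumption): real embeddings are learned from the knowledge
# graphs and typically have tens or hundreds of dimensions.
ru_embeddings = {
    "Москва": [0.9, 0.1, 0.0],
    "Пушкин": [0.1, 0.8, 0.3],
}
en_embeddings = {
    "Moscow": [0.85, 0.15, 0.05],
    "Pushkin": [0.05, 0.9, 0.25],
    "London": [0.6, 0.1, 0.7],
}

alignment = align(ru_embeddings, en_embeddings)
print(alignment)
```

Greedy nearest-neighbor matching of this kind can assign several source entities to the same target; it is shown here only because it is the simplest way to turn embedding similarity into an alignment.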