Intelligent Learning for Knowledge Graph towards Geological Data

2017, Vol 2017, pp. 1-13
Author(s):  
Yueqin Zhu ◽  
Wenwen Zhou ◽  
Yang Xu ◽  
Ji Liu ◽  
Yongjie Tan

The knowledge graph (KG), a popular form of semantic network, has been widely used. It provides an effective way to describe semantic entities and their relationships by extending an ontology at the entity level. This article focuses on the application of KGs in the traditional geological field and proposes a novel method for constructing a KG. Building on natural language processing (NLP) and data mining (DM) algorithms, we analyze the key technologies for designing a KG for geological data, including geological knowledge extraction and semantic association. By extracting a typical geological ontology from a large number of geological documents and open linked data, semantic interconnection is achieved, a KG framework for geological data is designed, an application system for the KG is constructed, and dynamic updating of the geological information is completed accordingly. Specifically, an unsupervised intelligent learning method using linked open data is incorporated into the geological document preprocessing, which ultimately generates a geological domain vocabulary. Furthermore, application cases in the KG system are provided to show the effectiveness and efficiency of our proposed intelligent learning approach for KGs.
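By way of illustration, a minimal sketch of the preprocessing idea described above — mining candidate terms from geological text and keeping those that resolve against a linked-open-data gazetteer — might look as follows. The sample documents, the gazetteer contents, and the frequency threshold are all invented for the example, not taken from the paper.

```python
# Minimal sketch of the document-preprocessing step: mine candidate domain
# terms from geological text and keep those that resolve against a
# linked-open-data gazetteer. Sample data is illustrative only.
import re
from collections import Counter

documents = [
    "The Jurassic sandstone overlies a granite basement in this region.",
    "Granite intrusions cut the sandstone and the older basalt flows.",
]

# Hypothetical set of entity labels harvested from linked open data.
lod_labels = {"sandstone", "granite", "basalt", "jurassic"}

term_counts = Counter(
    token
    for doc in documents
    for token in re.findall(r"[a-z]+", doc.lower())
)

# Keep terms that also resolve in the LOD gazetteer: these form the seed
# of the geological domain vocabulary.
domain_vocabulary = {
    term: count for term, count in term_counts.items()
    if term in lod_labels and count >= 1
}
print(domain_vocabulary)  # {'jurassic': 1, 'sandstone': 2, 'granite': 2, 'basalt': 1}
```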

Semantic Web, 2021, pp. 1-27
Author(s):  
Ahmet Soylu ◽  
Oscar Corcho ◽  
Brian Elvesæter ◽  
Carlos Badenes-Olmedo ◽  
Tom Blount ◽  
...  

Public procurement is a large market affecting almost every organisation and individual; therefore, governments need to ensure its efficiency, transparency, and accountability, while creating healthy, competitive, and vibrant economies. In this context, open data initiatives and the integration of data from multiple sources across national borders could transform the procurement market, for example by lowering the barriers to entry for smaller suppliers and encouraging healthier competition, in particular by enabling cross-border bids. Increasing amounts of open data are published in the public sector; however, these data are created and maintained in silos and are not straightforward to reuse or maintain because of technical heterogeneity, lack of quality, insufficient metadata, or missing links to related domains. To this end, we developed an open linked data platform, called TheyBuyForYou, consisting of a set of modular APIs and ontologies to publish, curate, integrate, analyse, and visualise an EU-wide, cross-border, and cross-lingual procurement knowledge graph. We developed advanced tools and services on top of the knowledge graph for anomaly detection, cross-lingual document search, and data storytelling. This article describes the TheyBuyForYou platform and knowledge graph, reports their adoption by different stakeholders and the challenges and experiences we went through while creating them, and demonstrates the usefulness of Semantic Web and Linked Data technologies for enhancing public procurement.
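As an illustration of how such a knowledge graph might be consumed, the following hedged sketch queries a procurement graph over SPARQL with Python's SPARQLWrapper. The endpoint URL and the Tender class URI are placeholders; the actual TheyBuyForYou ontology terms and endpoint may differ.

```python
# Hedged sketch of querying a procurement knowledge graph over SPARQL.
# The endpoint URL and vocabulary terms below are placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/tbfy/sparql"  # hypothetical endpoint

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    PREFIX dct: <http://purl.org/dc/terms/>
    SELECT ?tender ?title WHERE {
        ?tender a <https://example.org/tbfy#Tender> ;   # placeholder class
                dct:title ?title .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["tender"]["value"], row["title"]["value"])
```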


2018, Vol 12 (4), pp. 06-10
Author(s):  
Daniel Martínez Ávila ◽  
Richard P. Smiraglia ◽  
Rick Szostak ◽  
Andrea Scharnhorst ◽  
Wouter Beek ◽  
...  

Massive amounts of data from different contexts and producers are collected and connected, often relying solely on statistical techniques. Challenges to the acclaimed value of data lie in the precise definition of data and their associated contexts, as well as in the fact that data are not always published in meaningful and open ways. The Linked Data paradigm offers a solution to the limitations of simple keywords by using unique, resolvable, and shared identifiers instead of strings. This paper reports on a three-year research project, "Digging Into the Knowledge Graph," funded as part of the 2016 Round Four Digging Into Data Challenge (https://diggingintodata.org/awards/2016/project/digging-knowledge-graph). Our project compares terminology employed within the LOD cloud with terminology employed within two general but different knowledge organization systems (KOSs): the Universal Decimal Classification and the Basic Concepts Classification. We are exploring whether these classifications can encourage greater consistency in LOD terminology, and linking the largely distinct scholarly literatures that address LOD and KOSs. Our project is an attempt to connect the Linked Open Data community, which has tended to be centered in computer science, and the knowledge organization (KO) community, with members from linguistics, metaphysics, and library and information science. We focus on the challenges related to Big Data shared by both communities.


2017, Vol 108 (1), pp. 355-366
Author(s):  
Ankit Srivastava ◽  
Georg Rehm ◽  
Felix Sasaki

With the ever-increasing availability of linked multilingual lexical resources, there is renewed interest in extending Natural Language Processing (NLP) applications so that they can make use of the vast set of lexical knowledge bases available in the Semantic Web. In the case of Machine Translation (MT), systems can potentially benefit from such resources, since unknown words and ambiguous translations are among the most common sources of error. In this paper, we attempt to minimise these types of errors by interfacing Statistical Machine Translation (SMT) models with Linked Open Data (LOD) resources such as DBpedia and BabelNet. We perform several experiments based on the SMT system Moses and evaluate multiple strategies for exploiting knowledge from multilingual linked data in automatically translating named entities. We conclude with an analysis of best practices for multilingual linked data sets in order to optimise their benefit to multilingual and cross-lingual applications.
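One such strategy can be sketched as follows: resolving a named entity's target-language surface form from DBpedia labels so that the SMT system does not leave it untranslated or mistranslated. The DBpedia endpoint, the entity, and the rdfs:label usage below are standard DBpedia facilities; the actual integration with Moses is not shown.

```python
# Minimal sketch: look up the German surface form of a named entity via
# DBpedia rdfs:label, as one way to supply SMT with entity translations.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbr:  <http://dbpedia.org/resource/>
    SELECT ?label WHERE {
        dbr:Vienna rdfs:label ?label .
        FILTER (lang(?label) = "de")   # German surface form
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["label"]["value"])  # expected: "Wien"
```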


2021, Vol 26 (2), pp. 143-149
Author(s):  
Abdelghani Bouziane ◽  
Djelloul Bouchiha ◽  
Redha Rebhi ◽  
Giulio Lorenzini ◽  
Noureddine Doumi ◽  
...  

The evolution of the traditional Web into the Semantic Web makes the machine a first-class citizen on the Web and increases the discoverability and accessibility of unstructured Web-based data. This development makes it possible to use Linked Data technology as the background knowledge base for unstructured data, especially texts, now available in massive quantities on the Web. Given any text, the main challenge is determining the most relevant information in DBpedia with minimal effort and time. However, DBpedia annotation tools, such as DBpedia Spotlight, have mainly targeted the English and Latin-script DBpedia versions. The situation of the Arabic language is less bright: Arabic Web content does not reflect the importance of this language. We have therefore developed an approach to annotate Arabic texts with Linked Open Data, particularly DBpedia. The approach uses natural language processing and machine learning techniques to interlink Arabic text with Linked Open Data. Despite the high complexity of the domain-independent knowledge base and the limited resources in Arabic natural language processing, the evaluation results of our approach were encouraging.
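The candidate-lookup step of such an interlinking approach can be sketched as below: given a mention from an Arabic text, DBpedia is searched for resources whose Arabic rdfs:label matches it. This shows only the lookup; the paper's NLP and machine-learning disambiguation steps are not reproduced here.

```python
# Hedged sketch of candidate lookup for Arabic annotation: find DBpedia
# resources whose Arabic label matches a mention from the text.
from SPARQLWrapper import SPARQLWrapper, JSON

mention = "القاهرة"  # Arabic for "Cairo"

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery(f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?resource WHERE {{
        ?resource rdfs:label "{mention}"@ar .
    }} LIMIT 5
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["resource"]["value"])  # candidate DBpedia URIs for the mention
```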


2020, Vol 23 (4), pp. 808-834
Author(s):  
Olga Avenirovna Nevzorova

This review discusses methods and algorithms for increasing the expressiveness of linked data prepared for Web publication. The main approaches to the enrichment of ontologies are considered, and the methods on which they are based and the tools implementing them are described. The main stage in the general scheme of the linked-data life cycle in the Linked Open Data cloud is building the set of linked RDF triples. To improve the classification of data and the analysis of their quality, various methods are used to increase the expressiveness of linked data. The main idea of these methods is to enrich existing ontologies (expanding the basic knowledge schema) by adding or improving terminological axioms. Enrichment methods draw on techniques from various fields, such as knowledge representation, machine learning, statistics, natural language processing, formal concept analysis, and game theory.
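A minimal sketch of the kind of enrichment the review discusses — adding a terminological axiom (here, a subclass relation) to an existing schema — using rdflib; the namespace and class names are illustrative.

```python
# Sketch: enrich an ontology by adding a terminological axiom.
# Namespace and class names are invented for illustration.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/onto#")
g = Graph()
g.bind("ex", EX)

# Existing schema: two classes, initially unrelated.
g.add((EX.ElectricCar, RDF.type, OWL.Class))
g.add((EX.Car, RDF.type, OWL.Class))

# Enrichment step: a learned or extracted axiom makes the hierarchy explicit.
g.add((EX.ElectricCar, RDFS.subClassOf, EX.Car))

print(g.serialize(format="turtle"))
```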


Author(s):  
Caio Saraiva Coneglian ◽  
José Eduardo Santarem Segundo

The emergence of new technologies has introduced means of disseminating and making information available more efficiently. One initiative, called Europeana, has been promoting this adaptation of informational objects within the Web, and more specifically within Linked Data. This study therefore presents a discussion of the relationship between the Digital Humanities and Linked Open Data, as embodied by Europeana. To this end, we adopt an exploratory methodology that investigates questions related to Europeana's data model, the EDM, by means of SPARQL. As a result, we characterise the features of the EDM through the use of SPARQL. We also identify the importance that the concept of Digital Humanities holds within the context of Europeana. Keywords: Semantic web. Linked open data. Digital humanities. Europeana. EDM. Link: https://periodicos.ufsc.br/index.php/eb/article/view/1518-2924.2017v22n48p88/33031
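For a sense of what such an exploration looks like, the sketch below queries Europeana's public SPARQL endpoint for instances of the EDM class edm:ProvidedCHO; the exact query is illustrative rather than one used in the study.

```python
# Sketch: explore the Europeana Data Model (EDM) over SPARQL by listing
# a few provided cultural heritage objects (edm:ProvidedCHO).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://sparql.europeana.eu/")
sparql.setQuery("""
    PREFIX edm: <http://www.europeana.eu/schemas/edm/>
    SELECT ?cho WHERE {
        ?cho a edm:ProvidedCHO .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["cho"]["value"])  # URIs of provided cultural heritage objects
```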


Entropy, 2021, Vol 23 (6), pp. 664
Author(s):  
Nikos Kanakaris ◽  
Nikolaos Giarelis ◽  
Ilias Siachos ◽  
Nikos Karacapilidis

We consider the prediction of future research collaborations as a link prediction problem applied on a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph-kernel-based approaches on the performance of an ML model for the link prediction problem, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach through various feature combinations with respect to the link prediction problem. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the abovementioned performance metrics for the prediction of future research collaborations.
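A deliberately simplified sketch of the underlying idea — blending a structural score (common neighbors) with a textual score (cosine similarity of abstracts) for a candidate author pair — follows. The paper's actual pipeline (graph kernels, pre-trained embeddings, an ML classifier) is considerably richer; the toy graph, texts, and weighting here are invented.

```python
# Toy blend of structural and textual evidence for link prediction.
import math
from collections import Counter

import networkx as nx

G = nx.Graph()
G.add_edges_from([("a1", "a2"), ("a2", "a3"), ("a1", "a3"), ("a3", "a4")])

# Toy "abstracts" attached to author nodes.
texts = {
    "a1": "graph neural networks for link prediction",
    "a4": "link prediction with graph kernels",
}

def cosine(u, v):
    cu, cv = Counter(texts[u].split()), Counter(texts[v].split())
    dot = sum(cu[w] * cv[w] for w in cu)
    norm = math.sqrt(sum(c * c for c in cu.values())) * \
           math.sqrt(sum(c * c for c in cv.values()))
    return dot / norm if norm else 0.0

def score(u, v, alpha=0.5):
    structural = len(list(nx.common_neighbors(G, u, v)))  # shared co-authors
    textual = cosine(u, v)                                # abstract similarity
    return alpha * structural + (1 - alpha) * textual

print(score("a1", "a4"))  # blended evidence that a1 and a4 will collaborate
```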


Author(s):  
Shumin Shi ◽  
Dan Luo ◽  
Xing Wu ◽  
Congjun Long ◽  
Heyan Huang

Dependency parsing is an important task for Natural Language Processing (NLP). However, a mature parser requires a large treebank for training, which is still extremely costly to create. Tibetan is an extremely low-resource language for NLP: no Tibetan dependency treebank is available, and such treebanks are currently obtained only by manual annotation. Furthermore, there is little related research on treebank construction. We propose a novel method of multi-level chunk-based syntactic parsing to perform constituent-to-dependency treebank conversion for Tibetan under these scarce conditions. Our method mines more dependencies from Tibetan sentences, builds a high-quality Tibetan dependency tree corpus, and makes fuller use of the inherent regularities of the language itself. We train dependency parsing models on the treebank obtained by this preliminary conversion. The model achieves 86.5% accuracy, 96% LAS, and 97.85% UAS, which exceeds the best results of existing conversion methods. The experimental results show that our method is well suited to low-resource settings: it not only addresses the scarcity of Tibetan dependency treebanks but also avoids needless manual annotation. The method embodies the strength of knowledge-guided linguistic analysis, which is of great significance for advancing research on Tibetan information processing.
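The LAS and UAS figures reported above are the standard attachment scores for dependency parsing; for reference, they can be computed from gold and predicted (head, label) pairs as in this small sketch, where the example sentence is invented.

```python
# UAS: fraction of tokens with the correct head.
# LAS: fraction of tokens with the correct head AND dependency label.
def attachment_scores(gold, pred):
    """gold/pred: lists of (head_index, dependency_label) per token."""
    assert len(gold) == len(pred)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return uas_hits / n, las_hits / n  # (UAS, LAS)

gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl")]  # right head, wrong label
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f}, LAS={las:.2f}")  # UAS=1.00, LAS=0.67
```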


2021, Vol 13 (5), pp. 124
Author(s):  
Jiseong Son ◽  
Chul-Su Lim ◽  
Hyoung-Seop Shim ◽  
Ji-Sun Kang

Despite the development of various technologies and systems that use artificial intelligence (AI) to solve disaster-related problems, difficult challenges remain. Data are the foundation for solving diverse disaster problems with AI, big data analysis, and so on; therefore, we must focus on these data. Disaster data are domain-specific by disaster type, heterogeneous, and lack interoperability. In particular, in the case of open data related to disasters, the source and format of the data differ because the data are collected by different organizations, and the vocabularies used in each domain are inconsistent. This study proposes a knowledge graph to resolve the heterogeneity among various disaster data and to provide interoperability among domains. Among disaster domains, we describe a knowledge graph for flooding disasters built from Korean open datasets and cross-domain knowledge graphs. Furthermore, the proposed knowledge graph is used to assist in solving and managing disaster problems.
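A hedged sketch of the basic mechanism follows: records from different hypothetical sources are attached to a single flood-event node so that they become interoperable. All URIs and values are invented; the study's actual datasets and vocabularies are not reproduced.

```python
# Sketch: resolve heterogeneity by linking records about the same flood
# event, from different hypothetical sources, in one RDF graph.
from rdflib import Graph, Literal, Namespace, RDF

DIS = Namespace("http://example.org/disaster#")  # invented vocabulary
GEO = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")

g = Graph()
g.bind("dis", DIS)
g.bind("geo", GEO)

event = DIS["flood-2020-busan"]  # invented event identifier
g.add((event, RDF.type, DIS.FloodEvent))
# Rainfall observation from one agency, damage report from another,
# both attached to the same event node to make them interoperable.
g.add((event, DIS.rainfallMm, Literal(310)))
g.add((event, DIS.damagedBuildings, Literal(42)))
g.add((event, GEO.lat, Literal(35.18)))
g.add((event, GEO.long, Literal(129.08)))

print(g.serialize(format="turtle"))
```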


Author(s):  
Ane Bang-Kittilsen ◽  
Terje Midtbø

Geologists struggle to communicate the uncertainty that arises when mapping and interpreting the geological subsurface. Today, open data-sharing policies create new value from geological information for a broader user group of non-experts. It is crucial to develop standard methods for visualizing uncertainty in order to increase the usability of geological information. In this study, a web experiment was set up to analyze whether and how different design choices influence the sense of uncertainty; questions about the intuitiveness of symbols were also asked. Two hundred and ten participants from different countries completed the experiment, both experts and non-experts in geology. Traditional visualization techniques in geology, such as dashed lines, dotted lines, and the question mark, were tested, along with other visualizations such as hatched areas and variations of symbol size, zoom level, and reference information. The results show that design choices have an impact on the participants' assessment of uncertainty. Experts inquire about crucial information if it is not present. The results also suggest that, when visualizing uncertainty, all the elements in the representation, and specifically the line and area symbols that delineate and colour the features, must work together to make the right impression.
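Although the study itself is experimental rather than computational, the line-style encodings it tests are easy to illustrate. The following sketch (not from the study) draws an observed boundary as a solid line and an uncertain, interpreted continuation as a dashed line with the conventional question mark.

```python
# Illustrative sketch of one tested design choice: line style as an
# encoding of interpretation uncertainty on a geological boundary.
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(5, 3))

# Observed segment of the boundary: confident, solid.
ax.plot([0, 2], [1, 1.4], color="black", linestyle="-",
        label="observed boundary")
# Interpreted continuation: uncertain, dashed.
ax.plot([2, 4], [1.4, 1.7], color="black", linestyle="--",
        label="interpreted (uncertain)")
ax.annotate("?", xy=(4, 1.7), fontsize=14)  # conventional question mark

ax.legend(loc="lower right")
ax.set_title("Encoding interpretation uncertainty with line style")
plt.show()
```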

