Design of Open Knowledge Platform Based on Knowledge Base Utilization Model and Service Scenario to Support Solutions of Regional Issues

Author(s):  
ChulSu Lim et al.

An open knowledge platform can provide a refined knowledge base. We therefore build a platform for several application areas in a cloud computing environment that supports APIs for various data based on a knowledge utilization model. The goal of this platform is to maximize the utilization of the knowledge base. To achieve this goal, we designed its structure as an open knowledge platform. The design targets are to maximize the utilization of data linkage, to expand it to national common knowledge, and to increase its usability by providing services built on knowledge graphs. To design the platform, we identified its users, information sources, and infrastructure. In the process, we found it is crucial to specify roles and services for the users of the platform. The requirements are derived from a utilization model and a service scenario based on the knowledge graph. With the service scenario, stakeholders of the platform began to narrow down the function modules needed to support the service. One of these modules is national common knowledge within the knowledge base, which provides the essential connected knowledge needed to address regional government problems such as earthquakes and flooding. To increase the usability of data scattered across departments and agencies, the platform includes data linkage and knowledge connections between fragmented data sets. Subsequently, we designed modules to support the effective utilization of this knowledge. We also found that a cloud infrastructure, rather than in-house hardware and software, could provide flexible and compatible services for the platform; moreover, the cloud has advantages for big data analysis and distributed system interconnection. Utilization-model and scenario-based process modeling provide a systematic approach to designing an open knowledge platform that supports the many components required for interoperability, compatibility, and connectivity with other knowledge bases.
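Purely as a hypothetical illustration of how an application might consume such a platform through its APIs, a minimal Python client could look like the sketch below; the endpoint URL, parameter names, and response shape are assumptions, since the abstract does not specify the concrete interface:

```python
# Hypothetical sketch of an application calling the platform's knowledge-graph API.
# The endpoint and parameters are invented for illustration only.
import requests

PLATFORM_ENDPOINT = "https://open-knowledge.example.org/api/graph/query"  # hypothetical URL

def find_related_knowledge(keyword: str, domain: str = "disaster-response"):
    """Ask the platform for knowledge-graph entities linked to a keyword."""
    response = requests.get(
        PLATFORM_ENDPOINT,
        params={"q": keyword, "domain": domain, "limit": 20},
        timeout=10,
    )
    response.raise_for_status()
    # Assume the platform returns JSON with a list of linked entities.
    return response.json().get("results", [])

if __name__ == "__main__":
    for entity in find_related_knowledge("earthquake"):
        print(entity)
```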

Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is either created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types for enhancing the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither of the algorithms uses external knowledge, i.e., they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate as well as scalable. Both algorithms have been used for building the DBpedia 3.9 release: With SDType, 3.4 million missing type statements have been added, while using SDValidate, 13,000 erroneous RDF statements have been removed from the knowledge base.
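As a rough illustration of the intuition behind SDType (a simplified re-implementation for illustration, not the authors' code, which additionally weights properties by how discriminative their type distributions are), the following Python sketch infers likely types for an untyped resource from the statistical type distributions of the properties it uses:

```python
# Simplified SDType-style type inference: properties "vote" for the types
# their subjects usually have, and the votes are averaged per resource.
from collections import Counter, defaultdict

def build_type_distributions(triples, types):
    """triples: iterable of (subject, property, object); types: {resource: set of types}."""
    dist = defaultdict(Counter)  # property -> Counter over subject types
    for s, p, _ in triples:
        for t in types.get(s, ()):
            dist[p][t] += 1
    return dist

def sdtype_scores(resource, triples, dist):
    """Average the per-property type distributions over the resource's properties."""
    props = {p for s, p, _ in triples if s == resource}
    scores = Counter()
    for p in props:
        total = sum(dist[p].values()) or 1
        for t, n in dist[p].items():
            scores[t] += (n / total) / max(len(props), 1)
    return scores.most_common()  # highest-scoring candidate types first
```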


2020 ◽  
Vol 10 (8) ◽  
pp. 2651
Author(s):  
Su Jeong Choi ◽  
Hyun-Je Song ◽  
Seong-Bae Park

Knowledge bases such as Freebase, YAGO, DBpedia, and NELL contain a large number of facts with various entities and relations. Since they store many facts, they are regarded as core resources for many natural language processing tasks. Nevertheless, they are normally incomplete and have many missing facts. Such missing facts keep them from being used in diverse applications in spite of their usefulness. Therefore, completing knowledge bases is an important task. Knowledge graph embedding is one of the promising approaches to completing a knowledge base, and thus many variants of knowledge graph embedding have been proposed. It maps all entities and relations in a knowledge base onto a low-dimensional vector space. Then, candidate facts that are plausible in the space are determined as missing facts. However, any single knowledge graph embedding is insufficient to complete a knowledge base. As a solution to this problem, this paper defines knowledge base completion as a ranking task and proposes a committee-based knowledge graph embedding model for improving the performance of knowledge base completion. Since each knowledge graph embedding has its own idiosyncrasy, we form a committee of various knowledge graph embeddings to reflect various perspectives. After ranking all candidate facts according to their plausibility computed by the committee, the top-k facts are chosen as missing facts. Our experimental results on two data sets show that the proposed model achieves higher performance than any single knowledge graph embedding and shows robust performance regardless of k. These results prove that the proposed model considers various perspectives in measuring the plausibility of candidate facts.
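A minimal Python sketch of the committee idea follows. It assumes each committee member exposes a `score(head, relation, tail)` plausibility function (for example, a trained TransE or DistMult model), and it aggregates by averaging per-member ranks; this combination scheme is an illustrative choice, not necessarily the exact one used in the paper:

```python
# Committee-based ranking of candidate facts: each embedding model ranks the
# candidates independently, and candidates with the best average rank win.
def committee_rank(candidates, members, k=10):
    """candidates: list of (h, r, t) triples; members: objects with a score(h, r, t) method."""
    per_member_ranks = []
    for member in members:
        ordered = sorted(candidates, key=lambda c: member.score(*c), reverse=True)
        per_member_ranks.append({c: rank for rank, c in enumerate(ordered)})

    def avg_rank(c):
        # Lower average rank means higher committee plausibility.
        return sum(ranks[c] for ranks in per_member_ranks) / len(members)

    return sorted(candidates, key=avg_rank)[:k]  # top-k facts proposed as missing facts
```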


Information ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 463
Author(s):  
Antonia Azzini ◽  
Nicola Cortesi ◽  
Giuseppe Psaila

Many organizations must produce many reports for various reasons. Although this activity may appear simple to carry out, it is not: generating reports requires the collection of possibly large and heterogeneous data sets. Furthermore, different professional figures are involved in the process, possibly with different skills (database technicians, domain experts, employees): the lack of common knowledge and of a unifying framework significantly obstructs the effective and efficient definition and continuous generation of reports. This paper presents a novel framework named RADAR, which is the acronym for “Resilient Application for Dependable Aided Reporting”: the framework has been devised to be a “bridge” between data and the employees in charge of generating reports. Specifically, it builds a common knowledge base in which database administrators and domain experts describe their knowledge about the application domain and the gathered data; this knowledge can be browsed by employees to find the relevant data to aggregate and insert into reports while designing report layouts; the framework assists the overall process from data definition to report generation. The paper presents the application scenario and the vision by means of a running example, defines the data model and presents the architecture of the framework.
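The paper defines its own data model; purely as a hypothetical illustration of how a common knowledge base could tie business concepts, described by domain experts, to the physical data sources documented by database administrators and to report layouts, a sketch in Python dataclasses might look like this (all names are invented for the example):

```python
# Illustrative sketch (not the paper's actual data model) of a common
# knowledge base linking data sources, domain concepts, and report layouts.
from dataclasses import dataclass, field

@dataclass
class DataSource:
    name: str            # e.g. a table or view documented by the DBA
    connection: str      # connection string or catalog reference
    fields: list[str]

@dataclass
class Concept:
    label: str           # business term used by domain experts
    description: str
    source: DataSource
    source_field: str    # which physical field realizes the concept

@dataclass
class ReportField:
    concept: Concept
    aggregation: str = "none"   # e.g. "sum", "avg", "count"

@dataclass
class ReportLayout:
    title: str
    fields: list[ReportField] = field(default_factory=list)
```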


2020 ◽  
Author(s):  
Matheus Pereira Lobo

This paper highlights two categories of knowledge bases: one built as a repository of links, and the other based on units of knowledge.


2018 ◽  
Vol 2 ◽  
pp. e25614 ◽  
Author(s):  
Florian Pellen ◽  
Sylvain Bouquin ◽  
Isabelle Mougenot ◽  
Régine Vignes-Lebbe

Xper3 (Vignes Lebbe et al. 2016) is a collaborative knowledge base publishing platform that, since its launch in November 2013, has been adopted by over 2,000 users (Pinel et al. 2017). This is mainly due to its user-friendly interface and the simplicity of its data model. The data are stored in MySQL relational databases, but the exchange format uses the TDWG standard format SDD (Structured Descriptive Data; Hagedorn et al. 2005). However, each Xper3 knowledge base is a closed world that the author(s) may or may not share with the scientific community or the public by publishing its content and/or identification keys (Kopfstein 2016). The explicit taxonomic, geographic and phenotypic limits of a knowledge base are not always well defined in the metadata fields. Conversely, terminology vocabularies, such as the Phenotype and Trait Ontology (PATO) and the Plant Ontology (PO), and the software used to edit them, such as Protégé and Phenoscape, are essential in the semantic web but difficult to handle for biologists without computer skills. These ontologies constitute open worlds and are themselves expressed as RDF (Resource Description Framework) triples. Protégé offers visualization and reasoning capabilities for these ontologies (Gennari et al. 2003, Musen 2015). Our challenge is to combine the user-friendliness of Xper3 with the expressive power of OWL (Web Ontology Language), the W3C standard for building ontologies. We therefore focused on analyzing how the same taxonomic content is represented under Xper3 and under different models in OWL. After this critical analysis, we chose a description model that allows automatic export of SDD to OWL and can be easily enriched. We will present the results obtained and their validation on two knowledge bases, one on parasitic crustaceans (Sacculina) and the second on current and fossil ferns (Corvez and Grand 2014). The evolution of the Xper3 platform and the perspectives offered by this link with semantic web standards will be discussed.
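As a minimal sketch of what an SDD-to-OWL export could produce, the following Python snippet uses rdflib to express one descriptive character of a taxon as OWL axioms; the class and property names are illustrative assumptions and do not reflect the mapping actually chosen by the authors:

```python
# Illustrative export of a single descriptive character to OWL with rdflib.
from rdflib import Graph, Namespace, Literal, BNode
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/xper3-export#")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)
g.bind("owl", OWL)

# A taxon and a phenotypic descriptor ("carapace shape") with one state.
g.add((EX.Sacculina_carcini, RDF.type, OWL.Class))
g.add((EX.carapaceShape, RDF.type, OWL.ObjectProperty))
g.add((EX.GloboseCarapace, RDF.type, OWL.Class))
g.add((EX.GloboseCarapace, RDFS.label, Literal("carapace globose", lang="en")))

# Restrict the taxon to the observed state:
# Sacculina_carcini subClassOf (carapaceShape some GloboseCarapace)
restriction = BNode()
g.add((restriction, RDF.type, OWL.Restriction))
g.add((restriction, OWL.onProperty, EX.carapaceShape))
g.add((restriction, OWL.someValuesFrom, EX.GloboseCarapace))
g.add((EX.Sacculina_carcini, RDFS.subClassOf, restriction))

print(g.serialize(format="turtle"))
```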


Author(s):  
Yongrui Chen ◽  
Huiying Li ◽  
Yuncheng Hua ◽  
Guilin Qi

Formal query building is an important part of complex question answering over knowledge bases. It aims to build correct executable queries for questions. Recent methods try to rank candidate queries generated by a state-transition strategy. However, this candidate generation strategy ignores the structure of queries, resulting in a considerable number of noisy queries. In this paper, we propose a new formal query building approach that consists of two stages. In the first stage, we predict the query structure of the question and leverage the structure to constrain the generation of the candidate queries. We propose a novel graph generation framework to handle the structure prediction task and design an encoder-decoder model to predict the argument of the predetermined operation in each generative step. In the second stage, we follow the previous methods to rank the candidate queries. The experimental results show that our formal query building approach outperforms existing methods on complex questions while staying competitive on simple questions.
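The following Python sketch only illustrates the two-stage idea: predict an abstract query structure for the question, prune candidate queries that do not match it, and then rank what remains. The structure predictor, candidate generator, and ranker are stand-ins for the paper's neural components (the graph-generation framework and the ranking model); their names and interfaces are assumptions for illustration only.

```python
# Two-stage formal query building, with the learned components stubbed out.
def answer(question, kb, predict_structure, generate_candidates, rank):
    # Stage 1: predict the query structure (e.g. number of triples, aggregation).
    structure = predict_structure(question)
    # Constrain generation: discard candidates whose structure does not match.
    candidates = [q for q in generate_candidates(question, kb)
                  if q.structure() == structure]
    # Stage 2: rank the remaining candidates and execute the best one.
    ranked = rank(question, candidates)
    return kb.execute(ranked[0])
```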


Author(s):  
Ian Olver

Introduction: Data linkage of population data sets, often across jurisdictions, or linking health data sets or health data with non-health data, often involves balancing ethical principles such as privacy with beneficence as represented by the public good. Similar ethical dilemmas occur in health resource allocation decisions. The NHMRC has published a framework to guide policy on health resource allocation decisions that could be applied to ensure that the justification of data linkage projects is defensible as being in the interest of the public good. Objectives and Approach: The four main conditions for the legitimacy of policy decisions about access to healthcare in a democracy with a public health system and limited resources were examined for their relevance to decisions about the use of public data and the linking of data sets. Results: Public policy decisions must be defensible and responsive to the interests of those affected. Decision-makers should articulate their reasoning and recommendations so that citizens can judge them. While the context of policy decisions will differ, their legitimacy depends upon (1) the transparency of the reasoning, which should be free from conflicts of interest, with the basis for decisions recorded and reported widely; (2) the accountability of the decision-makers to the wider community; (3) the testability of the evidence used to inform the decision-making, which usually means that it will stand up to independent review; and (4) the inclusive recognition of those the decision affects, which often requires that the implications for disadvantaged groups are considered, even if they cannot always be accommodated. These conditions are interrelated but ensure that the good of society in general, and not just that of specific dominant groups, is accommodated. Conclusion / Implications: If these principles are applied to decisions about data linkage projects, they provide a clear basis for society to accept such projects, having balanced the public good against the ethical risks involved.


2016 ◽  
Vol 31 (2) ◽  
pp. 97-123 ◽  
Author(s):  
Alfred Krzywicki ◽  
Wayne Wobcke ◽  
Michael Bain ◽  
John Calvo Martinez ◽  
Paul Compton

Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and, at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Alexander Falenski ◽  
Armin A. Weiser ◽  
Christian Thöns ◽  
Bernd Appel ◽  
Annemarie Käsbohrer ◽  
...  

In case of contamination in the food chain, fast action is required in order to reduce the number of affected people. In such situations, being able to predict the fate of agents in foods would help risk assessors and decision makers assess the potential effects of a specific contamination event and thus enable them to deduce appropriate mitigation measures. One efficient strategy supporting this is the use of model-based simulations. However, application in crisis situations requires ready-to-use and easy-to-adapt models to be available from so-called food safety knowledge bases. Here, we illustrate this concept and its benefits by applying the modular open source software tools PMM-Lab and FoodProcess-Lab. As a fictitious sample scenario, an intentional ricin contamination at a beef salami production facility was modelled. Predictive models describing the inactivation of ricin were reviewed, relevant models were implemented with PMM-Lab, and simulations of residual toxin amounts in the final product were performed with FoodProcess-Lab. Due to the generic and modular modelling concept implemented in these tools, they can be applied to simulate virtually any food safety contamination scenario. Apart from application in crisis situations, the food safety knowledge base concept will also be useful in food quality and safety investigations.
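As a minimal sketch of the kind of predictive model such a food safety knowledge base can store and simulate, the snippet below implements a generic first-order (log-linear) inactivation curve in Python; the rate constant and process times are placeholders, not ricin parameters from the reviewed literature:

```python
# Generic first-order inactivation: C(t) = C0 * exp(-k * t).
# The rate constant below is a placeholder for illustration only.
import math

def residual_amount(initial_amount: float, rate_constant: float, minutes: float) -> float:
    """Residual agent amount after `minutes` of a process step with rate constant k [1/min]."""
    return initial_amount * math.exp(-rate_constant * minutes)

if __name__ == "__main__":
    # Hypothetical process step: up to 30 minutes at a temperature where k = 0.05 / min.
    for t in (0, 10, 20, 30):
        print(t, "min:", round(residual_amount(100.0, 0.05, t), 2), "% of initial toxin")
```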

