Progress and Challenges on Entity Alignment of Geographic Knowledge Bases

2019 ◽  
Vol 8 (2) ◽  
pp. 77 ◽  
Author(s):  
Kai Sun ◽  
Yunqiang Zhu ◽  
Jia Song

Geographic knowledge bases (GKBs) from multiple sources and in multiple forms are markedly heterogeneous, which hinders the integration of geographic knowledge. Entity alignment provides an effective way to find correspondences between entities by measuring the multidimensional similarity between entities from different GKBs, thereby overcoming the semantic gap. Thus, many efforts have been made in this field. This paper first proposes basic definitions and a general framework for the entity alignment of GKBs. Specifically, state-of-the-art algorithms for the entity alignment of GKBs are reviewed from three aspects: similarity metrics, similarity combination, and alignment judgement; the evaluation procedure for alignment results is also summarized. On this basis, eight challenges for future studies are identified. There is a lack of methods to assess the quality of GKBs. The alignment process should be improved by determining the best composition of heterogeneous features, optimizing alignment algorithms, and incorporating background knowledge. Furthermore, a unified infrastructure, techniques for aligning large-scale GKBs, and deep learning-based alignment techniques should be developed. Meanwhile, the generation of benchmark datasets for the entity alignment of GKBs and the applications of this field need to be investigated. The progress of this field will be accelerated by addressing these challenges.
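The three stages named in the abstract (similarity metrics, similarity combination, alignment judgement) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of a string ratio and a distance-decay function as the two metrics, the equal weights, the `scale_km` parameter, the 0.8 threshold, and the dictionary layout of an entity.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity metric 1: case-insensitive string ratio in [0, 1] (assumed choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def spatial_similarity(p, q, scale_km: float = 1.0) -> float:
    """Similarity metric 2: exponential decay of the haversine distance.

    p and q are (latitude, longitude) pairs in degrees; scale_km is a
    hypothetical decay parameter, not a value from the paper.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    dist_km = 2 * 6371.0 * math.asin(math.sqrt(h))
    return math.exp(-dist_km / scale_km)


def align(e1, e2, w_name=0.5, w_space=0.5, threshold=0.8):
    """Combine the metrics with a weighted sum, then judge by a threshold.

    Weights and threshold are illustrative assumptions.
    """
    score = (w_name * name_similarity(e1["name"], e2["name"])
             + w_space * spatial_similarity(e1["coord"], e2["coord"]))
    return score, score >= threshold
```

A weighted sum with a fixed threshold is only one simple way to combine similarities and make the alignment decision; it is used here because it keeps the three stages visibly separate.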

2023 ◽  
Vol 55 (1) ◽  
pp. 1-39
Author(s):  
Thanh Tuan Nguyen ◽  
Thanh Phuong Nguyen

Representing dynamic textures (DTs) plays an important role in many real implementations in the computer vision community. Due to the turbulent and non-directional motions of DTs along with the negative impacts of different factors (e.g., environmental changes, noise, illumination, etc.), efficiently analyzing DTs has raised considerable challenges for the state-of-the-art approaches. Over the past 20 years, many different techniques have been introduced to handle these well-known issues and enhance performance. Those methods have made valuable contributions, but the problems have been only partially addressed, particularly the recognition of DTs on large-scale datasets. In this article, we present a comprehensive taxonomy of DT representation in order to give a thorough overview of the existing methods along with overall evaluations of their performance. Accordingly, we arrange the methods into six canonical categories. Each category is then briefly presented in terms of its principal methodology stream and its related variants. The effectiveness of the state-of-the-art methods is then investigated and thoroughly discussed with respect to quantitative and qualitative evaluations in classifying DTs on benchmark datasets. Finally, we point out several potential applications and the remaining challenges that should be addressed in future work. In comparison with two existing shallow DT surveys (i.e., the first one is out of date as it was made in 2005, while the newer one (published in 2016) is an inadequate overview), we believe that our proposed comprehensive taxonomy not only provides a better view of DT representation for the target readers but also stimulates future research activities.


Semantic Web ◽  
2021 ◽  
pp. 1-25
Author(s):  
Jiaoyan Chen ◽  
Ernesto Jiménez-Ruiz ◽  
Ian Horrocks ◽  
Xi Chen ◽  
Erik Bryhn Myklebust

Various knowledge bases (KBs) have been constructed via information extraction from encyclopedias, text and tables, as well as alignment of multiple sources. Their usefulness and usability are often limited by quality issues. One common issue is the presence of erroneous assertions and alignments, often caused by lexical or semantic confusion. We study the problem of correcting such assertions and alignments, and present a general correction framework which combines lexical matching, context-aware sub-KB extraction, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated with one set of literal assertions from DBpedia, one set of entity assertions from an enterprise medical KB, and one set of mapping assertions from a music KB constructed by integrating Wikidata, Discogs and MusicBrainz. It has achieved promising results, with a correction rate (i.e., the ratio of the target assertions/alignments that are corrected with right substitutes) of 70.1%, 60.9% and 71.8%, respectively.


Author(s):  
Tommaso Pasini

Word Sense Disambiguation (WSD) is the task of identifying the meaning of a word in a given context. It lies at the base of Natural Language Processing as it provides semantic information for words. In the last decade, great strides have been made in this field and much effort has been devoted to mitigating the knowledge acquisition bottleneck problem, i.e., the problem of semantically annotating texts at a large scale and in different languages. This issue is ubiquitous in WSD as it hinders the creation of both multilingual knowledge bases and manually-curated training sets. In this work, we first introduce the reader to the task of WSD through a short historical digression and then take stock of the advancements made to alleviate the knowledge acquisition bottleneck problem. To that end, we survey the literature on manual, semi-automatic and automatic approaches to creating English and multilingual corpora tagged with sense annotations and present a clear overview of supervised models for WSD. Finally, we provide our view of the future directions that we foresee for the field.


Crystals ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 15
Author(s):  
Cheng-An Tao ◽  
Jian-Fang Wang

Metal-organic frameworks (MOFs) have been used in adsorption, separation, catalysis, sensing, photo/electro/magnetics, and biomedical fields because of their unique periodic pore structure and excellent properties, and they have become a hot research topic in recent years. Ball milling is a low-pollution, fast method for synthesizing MOFs at large scale. In recent years, many important advances have been made. In this paper, the factors influencing the synthesis of MOFs by grinding are reviewed systematically from four aspects: auxiliary additives, metal sources, organic linkers, and reaction-specific conditions (such as frequency, reaction time, and mass ratio of ball and raw materials). Prospects for the future development of the synthesis of MOFs by grinding are also proposed.


2020 ◽  
Vol 10 (1) ◽  
pp. 7
Author(s):  
Miguel R. Luaces ◽  
Jesús A. Fisteus ◽  
Luis Sánchez-Fernández ◽  
Mario Munoz-Organero ◽  
Jesús Balado ◽  
...  

Providing citizens with the ability to move around in an accessible way is a requirement for all cities today. However, modeling city infrastructures so that accessible routes can be computed is a challenge because it involves collecting information from multiple, large-scale and heterogeneous data sources. In this paper, we propose and validate the architecture of an information system that creates an accessibility data model for cities by ingesting data from different types of sources and provides an application that can be used by people with different abilities to compute accessible routes. The article describes the processes that allow building a network of pedestrian infrastructures from the OpenStreetMap information (i.e., sidewalks and pedestrian crossings), improving the network with information extracted from mobile-sensed LiDAR data (i.e., ramps, steps, and pedestrian crossings), detecting obstacles using volunteered information collected from the hardware sensors of the mobile devices of the citizens (i.e., ramps and steps), and detecting accessibility problems with software sensors in social networks (i.e., Twitter). The information system is validated through its application in a case study in the city of Vigo (Spain).


Energies ◽  
2021 ◽  
Vol 14 (10) ◽  
pp. 2833
Author(s):  
Paolo Civiero ◽  
Jordi Pascual ◽  
Joaquim Arcas Abella ◽  
Ander Bilbao Figuero ◽  
Jaume Salom

In this paper, we provide a view of the ongoing PEDRERA project, whose main aim is to design a district simulation model able to produce and analyze reliable predictions of potential business scenarios for large-scale retrofitting actions, and to evaluate the overall co-benefits resulting from the renovation process of a cluster of buildings. For this purpose, and following a Positive Energy Districts (PEDs) approach, the model combines systemized data, at both building and district scale, from multiple sources and domains. A sensitivity analysis of 200 scenarios provided a quick perception of how results will change once inputs are defined, and how the expected results will respond to stakeholders’ requirements. In order to enable a clever input analysis and to appraise wide-ranging ranks of Key Performance Indicators (KPIs) suited to each stakeholder and design phase targets, the model is currently being implemented in the urbanZEB tool’s web platform.


2018 ◽  
Vol 5 (2) ◽  
Author(s):  
Matthieu J. S. Brinkhuis ◽  
Alexander O. Savi ◽  
Abe D. Hofman ◽  
Frederik Coomans ◽  
Han L. J. Van der Maas ◽  
...  

With the advent of computers in education, and the ample availability of online learning and practice environments, enormous amounts of data on learning have become available. The purpose of this paper is to present a decade of experience with analyzing and improving an online practice environment for math, which has thus far recorded over a billion responses. We present the methods we use to both steer and analyze this system in real time, using scoring rules on accuracy and response times, a tailored rating system to provide both learners and items with current ability and difficulty ratings, and an adaptive engine that matches learners to items. Moreover, we explore the quality of fit by means of prediction accuracy and parallel item reliability. Limitations and pitfalls are discussed by diagnosing sources of misfit, like violations of unidimensionality and unforeseen dynamics. Finally, directions for development are discussed, including embedded learning analytics and a focus on online experimentation to evaluate both the system itself and the users’ learning gains. Though many challenges remain open, we believe that large steps have been made in providing methods to efficiently manage and research educational big data from a massive online learning system.
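The idea of a rating system that keeps ability and difficulty ratings current, paired with an adaptive engine that matches learners to items, can be sketched as follows. This is a generic Elo-style update under stated assumptions, not the authors' tailored rating system: the logistic expectation, the step size `k`, and the target success probability of 0.75 are hypothetical, and the sketch scores accuracy only, ignoring the response-time component the abstract mentions.

```python
import math


def expected_score(ability: float, difficulty: float) -> float:
    """Logistic probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def update(ability: float, difficulty: float, correct: bool, k: float = 0.1):
    """Elo-style update: move both ratings by the prediction error.

    k is a hypothetical step size; a correct answer raises the learner's
    ability and lowers the item's difficulty by the same amount.
    """
    surprise = (1.0 if correct else 0.0) - expected_score(ability, difficulty)
    return ability + k * surprise, difficulty - k * surprise


def pick_item(ability: float, item_difficulties: dict, target: float = 0.75) -> str:
    """Adaptive matching: choose the item whose predicted success
    probability is closest to an assumed target rate."""
    return min(
        item_difficulties,
        key=lambda name: abs(expected_score(ability, item_difficulties[name]) - target),
    )
```

Because each response updates the learner and the item together, both sets of ratings can be maintained on the fly as responses arrive, which is the property such a rating scheme needs to run in real time.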


Author(s):  
Siva Reddy ◽  
Mirella Lapata ◽  
Mark Steedman

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.


2019 ◽  
Vol 7 ◽  
Author(s):  
Brian Stucky ◽  
James Balhoff ◽  
Narayani Barve ◽  
Vijay Barve ◽  
Laura Brenskelle ◽  
...  

Insects are possibly the most taxonomically and ecologically diverse class of multicellular organisms on Earth. Consequently, they provide nearly unlimited opportunities to develop and test ecological and evolutionary hypotheses. Currently, however, large-scale studies of insect ecology, behavior, and trait evolution are impeded by the difficulty in obtaining and analyzing data derived from natural history observations of insects. These data are typically highly heterogeneous and widely scattered among many sources, which makes developing robust information systems to aggregate and disseminate them a significant challenge. As a step towards this goal, we report initial results of a new effort to develop a standardized vocabulary and ontology for insect natural history data. In particular, we describe a new database of representative insect natural history data derived from multiple sources (but focused on data from specimens in biological collections), an analysis of the abstract conceptual areas required for a comprehensive ontology of insect natural history data, and a database of use cases and competency questions to guide the development of data systems for insect natural history data. We also discuss data modeling and technology-related challenges that must be overcome to implement robust integration of insect natural history data.


2006 ◽  
Vol 44 (1) ◽  
pp. 96-105 ◽  
Author(s):  
William Easterly

Jeffrey Sachs's new book (The End of Poverty: Economic Possibilities for Our Time, Penguin Press: New York, 2005) advocates a “Big Push” featuring large increases in aid to finance a package of complementary investments in order to end world poverty. These recommendations are remarkably similar to those first made in the 1950s and 1960s in development economics. Today, as then, the Big Push recommendation overlooks the unsolvable information and incentive problems facing any large-scale planning exercise. A more promising approach would be to design incentives for aid agents to implement interventions piecemeal whenever they deliver large benefits for the poor relative to costs.

