From Textual Information Sources to Linked Data in the Agatha Project

Author(s):  
Paulo Quaresma ◽  
Vitor Beires Nogueira ◽  
Kashyap Raiyani ◽  
Roy Bayot ◽  
Teresa Gonçalves

2003 ◽  
Vol 125 (06) ◽  
pp. 44-46
Author(s):  
Jean Thilmany

This article observes that organizing information, after gathering it in the first place, is the key to actually using it. In this web-linked, data-rich world, engineers, like other professionals, must organize for easy access both the growing amount of information they create during the course of their day and the information they need to do their jobs. Indexing and linking documents and other information sources is an important first step, but engineers also need a way to organize the information so they can find it via an easy search. Concept maps can be used to capture a designer’s knowledge and cognitive skills that are not otherwise available to other designers. A concept map explains, through a formal step-by-step visual representation, how an engineer attacked a design problem, struggled with it, attacked it from another angle, and eventually solved it. The maps are archived in a database and can be referred to for help with similar problems.


2018 ◽  
Vol 18 (1) ◽  
pp. 109-123 ◽  
Author(s):  
John P. McCrae ◽  
Paul Buitelaar

Abstract Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. For many datasets there is a significant amount of textual information, in the form of labels, descriptions and documentation about the elements of the dataset, and the foundation of precise linking lies in applying semantic textual similarity to link these datasets. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets.
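Jaccard scores of the kind the abstract mentions operate on raw token overlap; a minimal sketch (the function and example labels are illustrative, not from the paper) shows why pure string similarity misses semantically equivalent labels:

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the token sets of two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

# Two labels that mean the same thing share no tokens, so Jaccard is zero:
print(jaccard("heart attack", "myocardial infarction"))  # 0.0
# Partial token overlap gives a middling score regardless of meaning:
print(jaccard("heart attack", "attack of the heart"))    # 0.5
```

Semantic textual similarity methods aim to score the first pair highly despite zero token overlap, which is exactly the gap the evaluated metrics address.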


2018 ◽  
Vol 119 (9/10) ◽  
pp. 586-596 ◽  
Author(s):  
Nazia Wahid ◽  
Nosheen Fatima Warraich ◽  
Muzammil Tahira

Purpose
This study aims to explore the development of the cataloguing standards used to organize information sources in libraries and information centers. Its key objective is to assess the challenges information professionals face in applying new bibliographic standards in a linked data (LD) environment.

Design/methodology/approach
This study is based on an extensive review of the scholarly literature. Several databases were searched to identify relevant literature, using keywords such as RDA, FRBR, MARC and BIBFRAME along with LD. The related literature was consulted and reviewed accordingly.

Findings
The findings reveal that cataloguing standards are continuously evolving with the advancement of information technology. Libraries have been publishing their legacy metadata as LD, and many tools have been developed for mapping library metadata into LD applications. The Library of Congress has developed the BIBFRAME model to fulfill the requirements of the new bibliographic standards using LD technology. The extensive use of MARC standards, the complexity of LD technologies, the non-availability of vocabularies and the inconsistency of terminologies are found to be the major challenges for libraries adopting LD applications.

Practical implications
This review will be a valuable addition for LIS scholars seeking to understand the challenges of LD application. The study should be significant for the library community and for policymakers interested in implementing LD technologies.

Originality/value
This paper is one of its kind, explaining the development of cataloguing models and standards along with the challenges of adopting LD applications for legacy data.


2020 ◽  
Author(s):  
Shalin Shah

Recommender systems aim to personalize the shopping experience of a user by suggesting related products, or products found to be in the general interests of the user. The information available for users and products is heterogeneous, and many systems use one or several of these sources. The available information includes the user's interaction history with products and categories, textual information about the products, a hierarchical classification of the products into a taxonomy, user interests based on a questionnaire, the demographics of a user, interests inferred from product reviews given by a user, interests based on the physical location of a user, and so on. Taxonomy discovery for personalized recommendation, published in 2014, uses the first three information sources: the user's interaction history, textual information about the products and, optionally, an existing taxonomy of the products. In this paper, we describe a parallel implementation of this approach on Apache Spark and discuss the modifications to the algorithm needed to scale it to several hundreds of thousands of users with a large inventory of products at Target Corporation. We run experiments on a sample of users and provide results, including some sample recommendations generated by our parallel algorithm.
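The per-user parallelism the abstract describes can be sketched in plain Python. This is a toy stand-in, not the paper's Spark implementation: the interaction data and the popularity-based `recommend` step are invented for illustration, and Spark's map over a users RDD plays the role the executor map plays here:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy interaction history: user -> products they interacted with
history = {
    "u1": ["shoes", "socks"],
    "u2": ["tv"],
    "u3": ["shoes"],
}

def recommend(user):
    """Placeholder per-user step: suggest the most popular product the user
    has not yet interacted with (a stand-in for the taxonomy-based model)."""
    counts = {}
    for items in history.values():
        for p in items:
            counts[p] = counts.get(p, 0) + 1
    seen = set(history[user])
    candidates = sorted((p for p in counts if p not in seen),
                        key=lambda p: (-counts[p], p))
    return user, candidates[:1]

# Each user is scored independently, so the work maps cleanly across workers,
# which is the same property that lets Spark distribute it across a cluster.
with ThreadPoolExecutor(max_workers=4) as pool:
    recs = dict(pool.map(recommend, history))
```

Because `recommend` touches no shared mutable state, scaling to hundreds of thousands of users is a matter of partitioning the user set, which is what the Spark version does at cluster scale.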


Author(s):  
Khaoula Mahmoudi ◽  
Sami Faïz

Geographic Information Systems (GIS) (Faïz, 1999) are being increasingly used to manage, retrieve, and store large quantities of data that would be tedious to handle manually. The power of GIS lies in helping managers make the critical decisions they face daily, and the ability to make sound decisions relies upon the availability of relevant information. Typically, spatial databases do not contain enough information to support the decision-making process in all situations. Indeed, Jack Dangermond, president of a private GIS software company, argued that “The application of GIS is limited only by the imagination of those who use it”. Hence, it is of primary interest to provide other data sources to make these systems rich information sources.


Author(s):  
DMITRY ZELENKO ◽  
OLEG SEMIN

We present an automatic system that discovers competing companies from public information sources. The system extracts data from text, uses transformation-based learning to obtain appropriate data normalization, combines structured and unstructured information sources, uses probabilistic modelling to represent models of linked data, and succeeds in autonomously discovering competitors. We also introduce an iterative graph reconstruction process for inference over relational data and show that it leads to improvements in performance. We validate the system's results and deploy the system on the web as a powerful analytic tool for individual and institutional investors.


Information ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 49
Author(s):  
Gerasimos Razis ◽  
Georgios Theofilou ◽  
Ioannis Anagnostopoulos

The appearance of images in social messages is continuously increasing, along with user engagement with that type of content. Analysis of social images can provide valuable latent information that is often not present in the social posts themselves. In that direction, a framework is proposed that exploits latent information from Twitter images by leveraging the Google Cloud Vision API platform, aiming to enrich social analytics with semantics and hidden textual information. As validated by our experiments, social analytics can be further enriched by considering the combination of user-generated content, latent concepts, and textual data extracted from social images, along with linked data. Moreover, we employed word-embedding techniques to investigate the use of latent semantic information for identifying similar Twitter images, thereby showcasing that hidden textual information can improve such information retrieval tasks. Finally, we offer an open, enhanced version of the annotated dataset described in this study, with the aim of further adoption by the research community.
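The word-embedding comparison of image labels reduces to cosine similarity between label vectors. A minimal sketch follows, with made-up three-dimensional vectors standing in for real embeddings (which would have hundreds of dimensions and come from a trained model, not be written by hand):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy "embeddings" for labels an image-annotation API might return
embeddings = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9],
}

print(cosine(embeddings["dog"], embeddings["puppy"]))  # ≈ 0.98
print(cosine(embeddings["dog"], embeddings["car"]))    # ≈ 0.01
```

Two images are then considered similar when the embedding vectors of their extracted labels have high cosine similarity, even if the label strings themselves differ.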


Author(s):  
Adam Albert ◽  
Marie Duží ◽  
Marek Menšík ◽  
Miroslav Pajr ◽  
Vojtěch Patschka

In this paper, we deal with supporting the search for appropriate textual sources. Users ask for an atomic concept that is explicated using machine-learning methods applied to different textual sources. We then process the explications so obtained to provide even more useful information. To this end, we apply the method of computing association rules, one of the data-mining methods used for information retrieval. Our background theory is the system of Transparent Intensional Logic (TIL); all the concepts are formalised as TIL constructions.
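Association rules of the kind mentioned can be mined with plain support/confidence counting. A minimal single-antecedent sketch follows; the thresholds and transaction data are illustrative, and the TIL formalisation layer is omitted entirely:

```python
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Mine rules A -> B over item pairs, filtered by support and confidence."""
    n = len(transactions)
    items = {i for t in transactions for i in t}

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for a, b in combinations(sorted(items), 2):
        for ant, cons in ((a, b), (b, a)):
            sup = support({ant, cons})
            if sup >= min_support:
                conf = sup / support({ant})
                if conf >= min_confidence:
                    rules.append((ant, cons, sup, conf))
    return rules

transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread"},
    {"milk"},
]
rules = association_rules(transactions)
print(rules)  # [('butter', 'bread', 0.5, 1.0)]
```

Note the asymmetry: "butter -> bread" holds with confidence 1.0 (every butter transaction contains bread), while "bread -> butter" only reaches 2/3 and is filtered out.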



