ScholarLens: extracting competences from research publications for the automatic generation of semantic user profiles

2017 ◽  
Vol 3 ◽  
pp. e121 ◽  
Author(s):  
Bahar Sateli ◽  
Felicitas Löffler ◽  
Birgitta König-Ries ◽  
René Witte

Motivation: Scientists increasingly rely on intelligent information systems to help them in their daily tasks, in particular for managing research objects such as publications or datasets. The relatively young research field of Semantic Publishing has been addressing the question of how scientific applications can be improved through semantically rich representations of research objects, in order to facilitate their discovery and re-use. To complement the efforts in this area, we propose an automatic workflow to construct semantic user profiles of scholars, so that scholarly applications, like digital libraries or data repositories, can better understand their users’ interests, tasks, and competences by incorporating these user profiles in their design. To make the user profiles sharable across applications, we propose to build them on standard semantic web technologies, in particular the Resource Description Framework (RDF) for representing user profiles and Linked Open Data (LOD) sources for representing competence topics. To avoid the cold start problem, we suggest automatically populating these profiles by analyzing the publications (co-)authored by users, which we hypothesize reflect their research competences. Results: We developed a novel approach, ScholarLens, which can automatically generate semantic user profiles for authors of scholarly literature. For modeling the competences of scholarly users and groups, we surveyed a number of existing linked open data vocabularies. In accordance with LOD best practices, we propose an RDF Schema (RDFS) based model for competence records that reuses existing vocabularies where appropriate. To automate the creation of semantic user profiles, we developed a complete workflow that analyzes full-text research articles through various natural language processing (NLP) techniques. In our method, we start by processing a set of research articles for a given user. Competences are derived by text mining the articles, including syntactic, semantic, and LOD entity linking steps. We then populate a knowledge base in RDF format with user profiles containing the extracted competences. We implemented our approach as an open source library and evaluated our system through two user studies, resulting in a mean average precision (MAP) of up to 95%. As part of the evaluation, we also analyze the impact of semantic zoning of research articles on the accuracy of the resulting profiles. Finally, we demonstrate how these semantic user profiles can be applied in a number of use cases, including article ranking for personalized search and finding scientists competent in a topic, for example to find reviewers for a paper. Availability: All software and datasets presented in this paper are available under open source licenses in the supplements and documented at http://www.semanticsoftware.info/semantic-user-profiling-peerj-2016-supplements. Additionally, development releases of ScholarLens are available on our GitHub page: https://github.com/SemanticSoftwareLab/ScholarLens.
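
The abstract above reports its evaluation as mean average precision (MAP). As a reading aid, the following is a minimal sketch of how MAP is commonly computed over ranked lists. It is not taken from the ScholarLens code base: the function names, the toy topic labels, and the choice to normalize by the total number of relevant items are assumptions made for illustration, and the paper's own evaluation protocol may differ.

```python
from typing import Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one ranked list.

    Sums precision@k at every rank k that holds a relevant item, then
    divides by the number of relevant items (assumed normalization).
    Items in `ranked` are assumed to be unique.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-list average precision values (one list per user)."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical data: ranked competence topics per user, and the topics
# each user judged to be relevant.
user_a = (["nlp", "rdf", "ontologies", "biology"], {"nlp", "ontologies"})
user_b = (["chemistry", "text mining"], {"text mining"})
map_score = mean_average_precision([user_a, user_b])
```

In this toy data, the first user's relevant topics sit at ranks 1 and 3 and the second user's at rank 2, so lists that place relevant topics earlier score higher.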

Author(s):  
Mayank Yuvaraj

The paper discusses the implementation of the 'CUB E-journal One Search' tool, which the Central Library, Central University of Bihar designed using Google Custom Search as an alternative to commercial discovery services, and its impact on library users. A descriptive survey method was used for the study. The study found that library users considered CUB E-journal One Search a useful tool for getting the information they wanted out of 9000 subscribed e-resources in the university. Most users used CUB E-journal One Search frequently to find relevant articles and to write their assignments and research articles. The study indicated that library users were influenced by Google-like single search boxes and wished to have the same features. Further, users expected features like document recommendation, search filters, RSS and on-screen help from the discovery tool. The paper is a first attempt to study the impact of open source discovery tools on library users. It should further give librarians in developing countries the confidence to deploy open source search solutions using Google Custom Search in their libraries.


2020 ◽  
Vol 9 (1) ◽  
Author(s):  
Sarah Williams

Objectives: This small-scale study explores the current state of connections between open data and open access (OA) articles in the life sciences. Methods: This study involved 44 openly available life sciences datasets from the Illinois Data Bank that had 45 related research articles. For each article, I gathered the OA status of the journal and the article on the publisher website and checked whether the article was openly available via Unpaywall and ResearchGate. I also examined how and where the open data was included in the HTML and PDF versions of the related articles. Results: Of the 45 articles studied, fewer than half were published in Gold/Full OA journals, and while the remaining articles were published in Gold/Hybrid journals, none of them were OA. This study found that OA articles pointed to the Illinois Data Bank datasets similarly to all of the related articles, most commonly with a data availability statement containing a DOI. Conclusions: The findings indicate that Gold OA in hybrid journals does not appear to be a popular option, even for articles connected to open data, and this study emphasizes the importance of data repositories providing DOIs, since the related articles frequently used DOIs to point to the Illinois Data Bank datasets. This study also revealed concerns about free (not licensed OA) access to articles on publisher websites, which will be a significant topic for future research.


Vision ◽  
2019 ◽  
Vol 3 (4) ◽  
pp. 55
Author(s):  
Kar ◽  
Corcoran

In this paper, a range of open-source tools, datasets, and software that have been developed for quantitative and in-depth evaluation of eye gaze data quality are presented. Eye tracking systems in contemporary vision research and applications face major challenges due to variable operating conditions such as user distance, head pose, and movements of the eye tracker platform. However, there is a lack of open-source tools and datasets that could be used for quantitatively evaluating an eye tracker’s data quality, comparing performance of multiple trackers, or studying the impact of various operating conditions on a tracker’s accuracy. To address these issues, an open-source code repository named GazeVisual-Lib is developed that contains a number of algorithms, visualizations, and software tools for detailed and quantitative analysis of an eye tracker’s performance and data quality. In addition, a new labelled eye gaze dataset that is collected from multiple user platforms and operating conditions is presented in an open data repository for benchmark comparison of gaze data from different eye tracking systems. The paper presents the concept, development, and organization of these two repositories that are envisioned to improve the performance analysis and reliability of eye tracking systems.


2020 ◽  
Author(s):  
Geoff Boeing

Cities worldwide exhibit a variety of street network patterns and configurations that shape human mobility, equity, health, and livelihoods. This study models and analyzes the street networks of each urban area in the world, using boundaries derived from the Global Human Settlement Layer. Street network data are acquired and modeled from OpenStreetMap with the open-source OSMnx software. In total, this study models over 160 million OpenStreetMap street network nodes and over 320 million edges across 8,914 urban areas in 178 countries, and attaches elevation and grade data. This article presents the study's reproducible computational workflow, introduces two new open data repositories of ready-to-use global street network models and calculated indicators, and discusses summary findings on street network form worldwide. It makes four contributions. First, it reports the methodological advances of this open-source workflow. Second, it produces an open data repository containing street network models for each urban area. Third, it analyzes these models to produce an open data repository containing street network form indicators for each urban area. No such global urban street network indicator dataset has previously existed. Fourth, it presents a summary analysis of urban street network form, reporting the first such worldwide results in the literature.


2020 ◽  
Vol 9 (10) ◽  
pp. 591
Author(s):  
Jan Pavlík ◽  
Markéta Hrnčírová ◽  
Michal Stočes ◽  
Jan Masner ◽  
Jiří Vaněk

Recently, the process of opening data has intensified, especially thanks to the involvement of many institutions that had not previously shared their data. Some entities provided data to the public long before the open data trend became widespread, but many institutions have only engaged in this process recently thanks to a systemic state-level effort to make data repositories available to the public. Therefore, there are many new potential sources of data available for research, including in the area of water management. This article analyses the current state of available data in the Czech Republic (their content, structure, format, availability, costs and other indicators that affect the usability of these data for independent researchers in the area of water management). The case study was conducted to ascertain the levels of accessibility and usability of data in open data repositories and the possibilities of obtaining data from IoT (Internet of Things) devices such as networked sensors where the required data are either not available from existing sources, too costly, or otherwise unsuitable for the research. The goal of the underlying research was to assess the impact/ratio of various watershed factors based on monitored indicators of water pollution in a model watershed. Such information would help propose measures for reducing the volume of pollution, resulting in increased security in terms of available drinking water for the capital city of Prague.

