LIMES: A Framework for Link Discovery on the Semantic Web

Author(s):  
Axel-Cyrille Ngonga Ngomo ◽  
Mohamed Ahmed Sherif ◽  
Kleanthi Georgala ◽  
Mofeed Mohamed Hassan ◽  
Kevin Dreßler ◽  
...  

Abstract The Linked Data paradigm builds upon a backbone of distributed knowledge bases connected by typed links. The sheer volume of current knowledge bases, as well as their number, poses two major challenges when aiming to support the computation of links across and within them. The first is that tools for link discovery have to be time-efficient when computing links. The second is that these tools have to produce links of high quality to serve the applications built upon Linked Data well. Solutions to the second problem build upon efficient computational approaches developed to solve the first and combine these with dedicated machine learning techniques. The current version of the Limes framework is the product of seven years of research on these two challenges. A series of machine learning techniques and efficient computation approaches were developed and integrated into this framework to address the link discovery problem. The framework combines these diverse algorithms within a generic and extensible architecture. In this article, we give an overview of version 1.7.4 of the open-source release of the framework. In particular, we focus on the architecture of the framework, an intuition of its inner workings, and a brief overview of the approaches it contains. Descriptions of applications in which the framework has been used complete the paper. Our framework is open-source and available under a GNU license at https://github.com/dice-group/LIMES, together with a user manual and a developer manual.
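To make the task concrete, the following is a minimal sketch of the link discovery problem that LIMES addresses, not LIMES's actual API: given resources from two knowledge bases, emit owl:sameAs links for pairs whose labels are sufficiently similar. The data, names, and the naive pairwise loop are illustrative assumptions; LIMES itself uses declarative link specifications and far more efficient algorithms.

```python
# Minimal, hypothetical sketch of link discovery (NOT the LIMES API):
# compare labels of resources from two knowledge bases and emit an
# owl:sameAs link whenever the similarity exceeds a threshold.
from difflib import SequenceMatcher

source = {"ex:Berlin": "Berlin", "ex:Paris": "Paris"}       # hypothetical KB 1
target = {"dbr:Berlin": "Berlin", "dbr:Parijs": "Paris"}    # hypothetical KB 2

def similarity(a: str, b: str) -> float:
    """Simple string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.9
links = [
    (s, "owl:sameAs", t)
    for s, s_label in source.items()
    for t, t_label in target.items()
    if similarity(s_label, t_label) >= THRESHOLD
]
for triple in links:
    print(triple)
```

The loop above is quadratic in the number of resources; the time-efficiency challenge named in the abstract is precisely about avoiding such exhaustive comparisons.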

2021 ◽  
Vol 1804 (1) ◽  
pp. 012133
Author(s):  
Mahmood Shakir Hammoodi ◽  
Hasanain Ali Al Essa ◽  
Wial Abbas Hanon

Software maintainability is a vital quality aspect according to ISO standards. It has been a concern for decades and remains a top priority today. At present, the majority of software applications, particularly open source software, are developed using object-oriented methodologies. Researchers have previously applied statistical techniques to metric data extracted from software in order to evaluate maintainability. More recently, machine learning models and algorithms have also been used in a majority of research works to predict maintainability. In this research, we performed an empirical case study on the open source software JFreeChart by applying machine learning algorithms. The objective was to study the relationships between certain metrics and maintainability.
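As a hedged illustration of this kind of study (on synthetic data, not the actual JFreeChart measurements), one can regress a maintainability proxy on object-oriented metrics and inspect the cross-validated fit; the metric names below are common CK metrics, assumed here purely for illustration.

```python
# Hypothetical sketch: learn the relationship between object-oriented
# metrics and a maintainability proxy. All values are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Columns: WMC, CBO, LCOM, DIT -- common CK metrics per class (assumed).
X = rng.integers(1, 50, size=(200, 4)).astype(float)
# Maintainability proxy (e.g., change count), a noisy function of the metrics.
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 2.0, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"Mean cross-validated R^2: {scores.mean():.2f}")
```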


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1421
Author(s):  
Haechan Park ◽  
Nakhoon Baek

With the growth of artificial intelligence and deep learning technology, many active research efforts apply the related techniques in various fields. To test and apply the latest machine learning techniques in gaming, it is very useful to have a lightweight game engine for quick prototyping. Our game engine is implemented in a cost-effective way, compared to well-known commercial proprietary game engines, by utilizing open source products. Due to its simple internal architecture, our game engine is especially well suited to modifying and reviewing new functions through quick, repetitive tests. In addition, the game engine has a DNN (deep neural network) module, with which it can apply deep learning algorithms to game features in real time. Our DNN module uses a simple C++ function interface rather than additional programming languages and/or scripts. This simplicity enables us to apply machine learning techniques to game applications more efficiently and casually. We also encountered some technical issues during development, mostly while integrating various open source products into a single game engine. We present details of these technical issues and our solutions.
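The paper's module exposes inference through a C++ function interface; the following Python sketch only illustrates the design idea of a single plain function call for DNN inference inside a game loop, with all weights and the game state invented for the example.

```python
# Analogous sketch of a single-call DNN interface (the actual module is C++):
# game code passes the current state to one function and gets action scores back.
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical weights of a tiny two-layer network controlling an NPC.
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 2)), np.zeros(2)

def dnn_infer(game_state: np.ndarray) -> np.ndarray:
    """Single-call inference: game state in, action scores out."""
    hidden = np.maximum(0.0, game_state @ W1 + b1)  # ReLU hidden layer
    return hidden @ W2 + b2

# In the game loop: pick an action each frame from the current state.
state = np.array([0.2, -1.0, 0.5, 0.0])
action = int(np.argmax(dnn_infer(state)))
print("chosen action:", action)
```

Keeping inference behind one function like this is what lets game code call the DNN per frame without touching a scripting layer.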


Author(s):  
Yuan Zhao ◽  
Tieke He ◽  
Zhenyu Chen

Assigning bug reports to individual developers is typically a manual, time-consuming, and tedious task. Although some machine learning techniques have been adopted to alleviate this dilemma, they mainly focus on open source projects, which use traditional repositories such as Bugzilla to manage their bug reports. With the boom of the mobile Internet, new requirements and methods of software testing are emerging, especially crowdsourced testing. Unlike traditional channels, whose bug reports are often heavyweight (standardized, with detailed attribute localization), bug reports tend to be lightweight in the context of crowdsourced testing. To exploit the differences in bug report assignment in this new setting, a unified bug report assignment framework is proposed in this paper. The framework is capable of handling both traditional heavyweight bug reports and lightweight ones by (i) first preprocessing the bug reports and selecting features, (ii) then tuning the parameters that indicate the ratios of choosing different methods to vectorize bug reports, and (iii) finally applying classification algorithms to assign bug reports. Extensive experiments are conducted on three datasets to evaluate the proposed framework. The results indicate the applicability of the proposed framework and also reveal the differences in bug report assignment between traditional repositories and crowdsourced ones.
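A hedged sketch of the three stages named above, on invented toy data: (i) preprocess the report text, (ii) vectorize it (plain TF-IDF here, whereas the paper tunes ratios between several vectorization methods), and (iii) classify each report to a developer.

```python
# Toy illustration of the three-stage assignment pipeline; reports and
# developer names are invented, and the ratio-tuning step is simplified
# to a single TF-IDF vectorizer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reports = [
    "app crashes on login screen",
    "null pointer exception in payment module",
    "UI button misaligned on settings page",
    "login token expires too early",
]
assignees = ["alice", "bob", "carol", "alice"]  # hypothetical developers

assigner = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),  # stages (i) + (ii)
    MultinomialNB(),                                        # stage (iii)
)
assigner.fit(reports, assignees)
print(assigner.predict(["crash when user logs in"]))  # likely -> ['alice']
```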


2014 ◽  
Vol 62 (3) ◽  
pp. 193-201 ◽  
Author(s):  
Fabio Ribeiro Cerqueira ◽  
Tiago Geraldo Ferreira ◽  
Alcione de Paiva Oliveira ◽  
Douglas Adriano Augusto ◽  
Eduardo Krempser ◽  
...  

2021 ◽  
Author(s):  
Pittawat Taveekitworachai ◽  
Jonathan H. Chan

The Krathu-500 corpus contains 574 Pantip post titles and bodies together with all comments on each post, for a total of 63,293 comments. The corpus provides Thai as used in real-life situations, in conversational form, across various contexts and types. It serves as a good resource for improving the capability of machine learning techniques that deal with the Thai language. A smaller, sentiment-labeled version of the comments dataset is also provided, with 6,306 records. The labeled corpus is a human-annotated dataset with three labels: negative, neutral, and positive. The project also comprises an open-source repository that allows anyone interested to modify and build on top of the current source code and dataset.
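A possible usage sketch for the sentiment-labeled subset follows; the file name and column names are assumptions for illustration, not the repository's documented layout, so the actual schema should be checked in the Krathu-500 repository.

```python
# Hypothetical usage: train a three-class (negative/neutral/positive)
# sentiment classifier on the labeled comments. File and column names
# are assumed, not taken from the repository.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("krathu500_sentiment.csv")  # assumed file name
X_train, X_test, y_train, y_test = train_test_split(
    df["comment"], df["label"], test_size=0.2, random_state=0
)

# Character n-grams are a reasonable default for Thai, which is written
# without spaces between words.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")
```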


2022 ◽  
Vol 80 (1) ◽  
Author(s):  
Romana Haneef ◽  
Mariken Tijhuis ◽  
Rodolphe Thiébaut ◽  
Ondřej Májek ◽  
Ivan Pristaš ◽  
...  

Abstract Background The capacity to use data linkage and artificial intelligence to estimate and predict health indicators varies across European countries. However, estimating health indicators from linked administrative data is challenging for several reasons, such as variability in data sources and data collection methods (resulting in reduced interoperability at various levels and reduced timeliness), the availability of a large number of variables, and a lack of skills and capacity to link and analyze big data. The main objective of this study is to develop methodological guidelines for calculating population-based health indicators, to guide European countries using linked data and/or machine learning (ML) techniques with new methods. Method We systematically performed the following step-wise approach to develop the methodological guidelines: (i) a scientific literature review, (ii) identification of inspiring examples from European countries, and (iii) development of a checklist of the guidelines' contents. Results We have developed methodological guidelines that provide a systematic approach for studies using linked data and/or ML techniques to produce population-based health indicators. These guidelines include a detailed checklist of the following items: rationale and objective of the study (i.e., the research question), study design, linked data sources, study population/sample size, study outcomes, data preparation, data analysis (i.e., statistical techniques, sensitivity analysis, and potential issues during data analysis), and study limitations. Conclusions This is the first study to develop methodological guidelines for studies focused on population health using linked data and/or machine learning techniques. These guidelines would support researchers in adopting and developing a systematic approach for high-quality research methods. There is a need for high-quality research methodologies using more linked data and ML techniques to develop a structured, cross-disciplinary approach to improving population health information and thereby population health.


Author(s):  
Sebastian Hellmann ◽  
Jens Lehmann ◽  
Sören Auer

The vision of the Semantic Web aims to make use of semantic representations on the largest possible scale: the Web. Large knowledge bases such as DBpedia, OpenCyc, and GovTrack are emerging and are freely available as Linked Data and SPARQL endpoints. Exploring and analysing such knowledge bases is a significant hurdle for Semantic Web research and practice. As one possible direction for tackling this problem, the authors present an approach for obtaining complex class expressions from objects in knowledge bases by using machine learning techniques. The chapter describes in detail how to leverage existing techniques to achieve scalability on large knowledge bases available as SPARQL endpoints or Linked Data. The algorithms are made available in the open source DL-Learner project, and the chapter presents several real-life scenarios in which they can be used by Semantic Web applications.
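As a hedged sketch of the data-gathering step such approaches depend on, the following pulls instances from the public DBpedia SPARQL endpoint to serve as positive examples for concept learning; the actual class expression learning is performed by DL-Learner itself, and this only illustrates endpoint access. The query and class are illustrative choices, not taken from the chapter.

```python
# Fetch candidate positive examples from a SPARQL endpoint (DBpedia).
# The query and class are illustrative; DL-Learner would consume such
# URIs as positive examples of a learning problem.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?city WHERE {
        ?city a dbo:City ;
              dbo:country dbr:Germany .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
positives = [b["city"]["value"] for b in results["results"]["bindings"]]
print(positives)  # URIs usable as positive examples for concept learning
```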

