Design of Link Evaluation Method to Improve Reliability based on Linked Open Big Data and Natural Language Processing

2018 ◽  
Vol 7 (3.33) ◽  
pp. 168
Author(s):  
Yonglak SHON ◽  
Jaeyoung PARK ◽  
Jangmook KANG ◽  
Sangwon LEE

LOD data sets consist of RDF triples based on an ontology, a specification of existing facts, and are linked to previously published knowledge according to linked data principles. These structured LOD clouds form a large global data network, which provides a more accurate foundation for delivering the information users want. However, when the same object is identified differently across several LOD data sets, it is difficult to establish that the instances are in fact identical. This is because objects with different URIs in LOD data sets are treated as different by default, and they must be closely examined for similarities before they can be judged identical. In this study, the proposed model, RILE, evaluates similarity by comparing the object values of existing specified predicates. In experiments with the model, we observed an improvement in the confidence level of the connection by extracting the link value.
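The idea of comparing object values of specified predicates can be sketched as follows. This is a minimal illustration and not the RILE model itself: the function names, the use of `difflib` string similarity, the averaging rule, and the example entities and predicates are all assumptions made for the sketch.

```python
from difflib import SequenceMatcher


def value_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] between two object values (case-insensitive)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_score(entity_a: dict, entity_b: dict, predicates: list) -> float:
    """Average value similarity over the listed predicates that both entities define.

    Returns 0.0 when the entities share none of the predicates.
    """
    shared = [p for p in predicates if p in entity_a and p in entity_b]
    if not shared:
        return 0.0
    total = sum(value_similarity(entity_a[p], entity_b[p]) for p in shared)
    return total / len(shared)


# Hypothetical entities from two data sets, each keyed by predicate.
PREDICATES = ["rdfs:label", "dbo:country"]
entity_a = {"rdfs:label": "Seoul", "dbo:country": "South Korea"}
entity_b = {"rdfs:label": "seoul", "dbo:country": "South Korea"}
entity_c = {"rdfs:label": "Busan", "dbo:country": "South Korea"}

print(link_score(entity_a, entity_b, PREDICATES))
print(link_score(entity_a, entity_c, PREDICATES))
```

A higher score suggests that two URIs may refer to the same object; where to set a threshold for asserting a link is left open here.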

2020 ◽  
Vol 4 (1) ◽  
pp. 18-43
Author(s):  
Liuqing Li ◽  
Jack Geissinger ◽  
William A. Ingram ◽  
Edward A. Fox

Natural language processing (NLP) covers a large number of topics and tasks related to data and information management, leading to a complex and challenging teaching process. Meanwhile, problem-based learning is a teaching technique specifically designed to motivate students to learn efficiently, work collaboratively, and communicate effectively. With this aim, we developed a problem-based learning course for both undergraduate and graduate students to teach NLP. We provided student teams with big data sets, basic guidelines, cloud computing resources, and other aids to help different teams in summarizing two types of big collections: Web pages related to events, and electronic theses and dissertations (ETDs). Student teams then deployed different libraries, tools, methods, and algorithms to solve the task of big data text summarization. Summarization is an ideal problem for learning NLP since it involves all levels of linguistics, as well as many of the tools and techniques used by NLP practitioners. The evaluation results showed that all teams generated coherent and readable summaries. Many summaries were of high quality and accurately described their corresponding events or ETD chapters, and the teams produced them along with NLP pipelines in a single semester. Further, both undergraduate and graduate students gave statistically significant positive feedback, relative to other courses in the Department of Computer Science. Accordingly, we encourage educators in the data and information management field to use our approach or similar methods in their teaching and hope that other researchers will also use our data sets and synergistic solutions to approach the new and challenging tasks we addressed.


2021 ◽  
Vol 50 (2-3) ◽  
pp. 17-22
Author(s):  
Johannes Brunzel

This article explains to what extent quantitative text analysis can be an essential means of increasing business efficiency. It goes beyond listing the opportunities and risks of using artificial intelligence and big data analytics by deriving, in a practice-oriented manner, important developments in quantitative content analysis from the economics and business literature. The article then divides the most important implementation steps into (1) collecting quantitative text data, (2) performing generic text analysis, and (3) performing natural language processing. One main finding is that although natural language processing approaches offer more advanced and more complex insights, the potential of generic text analysis has not yet been exhausted, owing to its flexibility and comparatively simple applicability in a corporate context. In addition, managers face the dichotomous decision of whether programming-based or commercial solutions are appropriate for carrying out the text analysis.


2020 ◽  
Vol 34 (05) ◽  
pp. 8504-8511
Author(s):  
Arindam Mitra ◽  
Ishan Shrivastava ◽  
Chitta Baral

Natural Language Inference (NLI) plays an important role in many natural language processing tasks such as question answering. However, existing NLI modules that are trained on existing NLI datasets have several drawbacks. For example, they do not capture the notion of entity and role well and often make mistakes such as concluding that “Peter signed a deal” can be inferred from “John signed a deal”. As part of this work, we have developed two datasets that help mitigate such issues and make the systems better at understanding the notion of “entities” and “roles”. After training the existing models on the new datasets, we observe that they do not perform well on one of the new benchmarks. We then propose a modification to the “word-to-word” attention function, which has been uniformly reused across several popular NLI architectures. The resulting models perform as well as their unmodified counterparts on the existing benchmarks and perform very well on the new benchmarks that emphasize “roles” and “entities”.
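To make the “word-to-word” attention function concrete, the sketch below shows a common unmodified form: dot-product scores between word vectors, normalised with a softmax. It does not show the authors' modification, which the abstract does not describe. The two-dimensional vectors are hypothetical toy values, chosen so that the two names have similar embeddings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def word_to_word_attention(premise, hypothesis):
    """For each hypothesis word vector, return attention weights over the premise words.

    Scores are plain dot products; each row of the result sums to 1.
    """
    weights = []
    for h in hypothesis:
        scores = [sum(hi * pi for hi, pi in zip(h, p)) for p in premise]
        weights.append(softmax(scores))
    return weights


# Hypothetical toy embeddings: the two names are close to each other.
john, peter, signed = [1.0, 0.1], [0.9, 0.2], [0.0, 1.0]

premise = [john, signed]   # "John signed ..."
hypothesis = [peter]       # "Peter ..."

attn = word_to_word_attention(premise, hypothesis)
print(attn[0])  # weights of "Peter" over ["John", "signed"]
```

With these toy vectors, “Peter” attends most strongly to “John”, which illustrates how similarity-based alignment can treat two different entities as a match.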


Author(s):  
Saravanakumar Kandasamy ◽  
Aswani Kumar Cherukuri

Quantifying semantic similarity between concepts is an essential part of domains such as Natural Language Processing, Information Retrieval, and Question Answering, where it helps systems understand texts and their relationships better. Over the last few decades, many measures have been proposed that incorporate various corpus-based and knowledge-based resources. WordNet and Wikipedia are two such knowledge-based resources. WordNet's contribution to these domains is enormous because of the richness with which it defines a word and all of its relationships with other words. In this paper, we propose an approach to quantifying the similarity between concepts that exploits the synsets and the gloss definitions of different concepts in WordNet. Our method considers the gloss definitions, the contextual words that help define a word, the synsets of those contextual words, and the confidence of occurrence of a word in another word's definition when calculating the similarity. An evaluation on different gold standard benchmark datasets shows the efficiency of our system in comparison with other existing taxonomical and definitional measures.
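A much simplified, Lesk-style illustration of definition-based similarity is sketched below. It is not the authors' measure: it ignores synsets and the confidence of occurrence, and scores two concepts only by the Jaccard overlap of the content words in their glosses. The glosses are hand-written stand-ins, not actual WordNet text, and the stopword list is an assumption.

```python
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "is",
             "for", "with", "to", "in", "on", "by"}


def gloss_tokens(gloss: str) -> set:
    """Lowercased content words of a gloss, with stopwords removed."""
    return {w for w in gloss.lower().split() if w not in STOPWORDS}


def gloss_overlap(gloss_a: str, gloss_b: str) -> float:
    """Jaccard overlap between the content words of two glosses, in [0, 1]."""
    a, b = gloss_tokens(gloss_a), gloss_tokens(gloss_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical glosses written for this sketch (not taken from WordNet).
GLOSSES = {
    "car": "a motor vehicle with four wheels used for carrying passengers",
    "bus": "a large motor vehicle used for carrying many passengers",
    "banana": "a long curved yellow fruit",
}

print(gloss_overlap(GLOSSES["car"], GLOSSES["bus"]))
print(gloss_overlap(GLOSSES["car"], GLOSSES["banana"]))
```

In this toy setting “car” and “bus” share most of their defining words, while “car” and “banana” share none, so the first pair scores higher.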


Author(s):  
Kanza Noor Syeda ◽  
Syed Noorulhassan Shirazi ◽  
Syed Asad Ali Naqvi ◽  
Howard J Parkinson ◽  
Gary Bamford

Thanks to powerful modern computing, the explosion in data availability, and advanced analytics, there should be opportunities to use a Big Data approach to proactively identify high-risk scenarios on the railway. In this chapter, we examine the need for developing machine intelligence to identify heightened risk on the railway. Having explained the potential for a new data-driven approach in the railway, we focus the rest of the chapter on Natural Language Processing (NLP) and its potential for analysing accident data. We review and analyse investigation reports of railway accidents in the UK, published by the Rail Accident Investigation Branch (RAIB), aiming to reveal the presence of entities that are informative of causes and failures, such as human, technical and external ones. We give an overview of a framework based on NLP and machine learning to analyse the raw text from RAIB reports, which would help risk and incident analysis experts study causal relationships between causes and failures, with a view to overall safety in the rail industry.


Author(s):  
Azleena Mohd Kassim ◽  
Yu-N Cheah

Information Technology (IT) is often employed to put knowledge management policies into operation. However, many of these tools require human intervention when it comes to deciding how the knowledge is to be managed. The Semantic Web may be an answer to this issue, but many Semantic Web tools are not readily available for the regular IT user. Another problem that arises is that typical efforts to apply or reuse knowledge via a search mechanism do not necessarily link to other pages that are relevant. Blogging systems appear to address some of these challenges, but the browsing experience can be further enhanced by providing links to other relevant posts. In this chapter, the authors present a semantic blogging tool called SEMblog to identify, organize, and reuse knowledge based on the Semantic Web and ontologies. The SEMblog methodology brings together technologies such as Natural Language Processing (NLP), Semantic Web representations, and the ubiquity of the blogging environment to produce a more intuitive way to manage knowledge, especially in the areas of knowledge identification, organization, and reuse. Based on detailed comparisons with other similar systems, the uniqueness of SEMblog lies in its ability to automatically generate keywords and semantic links.


2020 ◽  
pp. 168-187
Author(s):  
George A. Khachatryan

What are the relative merits of instruction modeling and other approaches to the design of blended learning programs? This chapter discusses several prevailing approaches, including applied learning science, personalization, and the use of big data in education. Many programs are designed around a single claimed feature of good instruction; terming such thinking “featurism,” this chapter argues that it is reductionist and less likely to be successful than more comprehensive approaches (such as instruction modeling). However, instruction modeling is not simply an alternative to other approaches: as the example of cognitive psychology illustrates, instruction modeling can often be fruitfully combined with other methods. Just as good software developers blend different approaches (e.g., using usability testing and the psychology of attention in designing interfaces), good instructional designers should draw on a wide range of techniques. This chapter discusses how instruction modeling can work in concert with big data, natural language processing, and other important approaches.

