Knowledge Discovery From Massive Data Streams

Author(s):  
Sushil Kumar Narang ◽  
Sushil Kumar ◽  
Vishal Verma

T.S. Eliot once wrote the poetic line, “Where is the knowledge we have lost in information?” Eliot could hardly have anticipated how aptly that line describes today's scenario. Data is now a profuse resource in many circumstances, and it is piling up so quickly that many technical leaders find themselves drowning in it. From this big stream of data flows a vast flood of information that seems to cross all manageable boundaries. Since information is the necessary channel for educing and constructing knowledge, one can see the importance of new and comprehensive knowledge discovery tools and techniques for mining this overflowing sea of information into explicit knowledge. This chapter describes traditional as well as modern research techniques for knowledge discovery from massive data streams. These techniques have been applied effectively not only to fully structured data but also to semi-structured and unstructured data. At the same time, Semantic Web technologies in today's perspective require many of them to deal with all sorts of raw data.



Author(s):  
Jennifer Sampson ◽  
John Krogstie ◽  
Csaba Veres

Recently, semantic web technologies such as ontologies have been proposed as key enablers for integrating heterogeneous data schemas in business and governmental systems. As differing ontologies proliferate, algorithms designed to align different but related ontologies have become necessary. Ontology alignment seeks, for each entity in one ontology, the corresponding entity in a second ontology with the same or the closest meaning. This research is motivated by the need for tools and techniques to support the task of validating ontology alignment statements, since the results of automated tools cannot be guaranteed to be accurate. The authors present a framework for understanding ontology alignment quality and describe how AlViz, a tool for visual ontology alignment, may be used to improve the quality of alignment results. An experiment was undertaken to test the claim that AlViz supports the task of validating ontology alignments. A promising result is that the tool shows potential for identifying missing alignments and for rejecting false alignments.
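The alignment step described above can be illustrated with a toy lexical matcher (not AlViz itself, which is a visual validation tool): given two small ontologies as lists of entity labels, a hypothetical string-similarity matcher proposes, for each entity in the first ontology, the closest-named entity in the second — exactly the kind of candidate statement that then needs human validation. All entity names here are invented for illustration.

```python
from difflib import SequenceMatcher

def propose_alignments(onto_a, onto_b, threshold=0.6):
    """For each entity label in onto_a, propose the closest label in onto_b.

    A toy lexical matcher: real alignment tools also use structure and
    semantics, and their output still needs validation, as argued above.
    """
    proposals = []
    for a in onto_a:
        best, score = None, 0.0
        for b in onto_b:
            s = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if s > score:
                best, score = b, s
        if score >= threshold:
            proposals.append((a, best, round(score, 2)))
    return proposals

# Hypothetical entity labels from two schemas to be integrated.
# "Person"/"Human" is a missing alignment a lexical matcher cannot find,
# illustrating why validation tools remain necessary.
candidates = propose_alignments(
    ["Person", "Organisation", "hasEmployee"],
    ["Human", "Organization", "employs", "Project"],
)
print(candidates)
```

Note that the synonym pair is silently dropped while the spelling variants are matched with high confidence — a concrete case of the "missing alignment" problem the experiment targets.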




2008 ◽  
Vol 22 (2) ◽  
pp. 249-277 ◽  
Author(s):  
Tod Sedbrook ◽  
Richard I. Newmark

ABSTRACT: Enterprise modelers require tools and techniques that consistently represent and logically apply domain knowledge. Current modeling approaches rely on entity-relationship or unified modeling diagrams to represent semantic descriptions of business exchanges. However, it remains difficult to transform the implicit metadata, ontologies, and logic embedded in diagrams into a coherent form that can be interpreted by machines and delivered across the web. This study explores uniting the machine-processing capabilities of semantic web technologies with resource-event-agent (REA) enterprise ontologies to model complex multienterprise partnerships. Web Ontology Language (OWL) and Semantic Web Rule Language (SWRL) were used to model REA policies for a distributed e-commerce partnership selling nearly new vehicles. We combine a specialized REA application ontology with semantic technologies to direct multienterprise collaborations. We present a prototype that encodes the ontology's concepts in OWL and SWRL and explore these machine-readable representations within the context of a case study.
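As a hedged illustration of the kind of encoding described here — not the study's actual ontology — the sketch below assembles a minimal REA-flavoured OWL fragment in Turtle syntax, with an SWRL-style policy rule shown as a comment. The `ex:` class and property names are hypothetical.

```python
# A minimal, illustrative REA-style OWL fragment in Turtle syntax,
# assembled as plain strings. All names under the `ex:` namespace
# are hypothetical, not taken from the study's ontology.
PREFIXES = """\
@prefix ex:   <http://example.org/rea#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
"""

def rea_class(name, parent=None):
    """Render one OWL class declaration, optionally with a superclass."""
    lines = [f"ex:{name} a owl:Class ."]
    if parent:
        lines.append(f"ex:{name} rdfs:subClassOf ex:{parent} .")
    return "\n".join(lines)

# Core REA concepts: economic resources, events, and agents,
# with Sale as a specialized economic event.
fragment = "\n".join([
    PREFIXES,
    rea_class("EconomicResource"),
    rea_class("EconomicEvent"),
    rea_class("EconomicAgent"),
    rea_class("Sale", parent="EconomicEvent"),
])

# An SWRL-style policy rule (informal, illustrative only):
#   Sale(?e) ^ stockflow(?e, ?r) -> EconomicResource(?r)
# i.e. anything that flows through a sale event is an economic resource.
print(fragment)
```

In a full prototype such fragments would be authored in an ontology editor and consumed by an OWL/SWRL reasoner rather than built by hand; the point here is only the shape of the machine-readable encoding.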


Informatica ◽  
2015 ◽  
Vol 26 (2) ◽  
pp. 221-240 ◽  
Author(s):  
Valentina Dagienė ◽  
Daina Gudonienė ◽  
Renata Burbaitė

Metabolites ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 48
Author(s):  
Marc Feuermann ◽  
Emmanuel Boutet ◽  
Anne Morgat ◽  
Kristian Axelsen ◽  
Parit Bansal ◽  
...  

The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support both for the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and their precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures, and we show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to mine further or exploit as an annotation source, to enrich other natural product datasets and databases.
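The programmatic, federated access described above can be sketched as follows. The query is adapted from the style of UniProt's published SPARQL examples (catalytic-activity annotations linked to Rhea reactions, federated to the Rhea endpoint to filter by a ChEBI compound); treat the exact predicates as a best-effort sketch to be verified against the endpoint documentation, and the ChEBI identifier as an arbitrary example. The code only builds the request and does not hit the network.

```python
from urllib.parse import urlencode

# UniProt's public SPARQL endpoint (see sparql.uniprot.org).
ENDPOINT = "https://sparql.uniprot.org/sparql"

def rhea_query(chebi_id: str) -> str:
    """SPARQL sketch: reviewed proteins whose catalyzed Rhea reactions
    involve the given ChEBI compound. Predicates follow the UniProt core
    and Rhea vocabularies; verify against the endpoints' documentation."""
    return f"""
PREFIX up:    <http://purl.uniprot.org/core/>
PREFIX rh:    <http://rdf.rhea-db.org/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>

SELECT DISTINCT ?protein ?rhea
WHERE {{
  ?protein a up:Protein ;
           up:reviewed true ;
           up:annotation ?ann .
  ?ann up:catalyticActivity ?ca .
  ?ca  up:catalyzedReaction ?rhea .
  SERVICE <https://sparql.rhea-db.org/sparql> {{
    ?rhea rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{chebi_id} .
  }}
}}
LIMIT 20
"""

# Request parameters; actually executing the query requires network
# access, e.g. urllib.request.urlopen(ENDPOINT + "?" + params).
params = urlencode({"query": rhea_query("17634"), "format": "srj"})
print(params[:60])
```

The `SERVICE` clause is SPARQL 1.1 federation: the ChEBI filtering runs on the Rhea endpoint while the protein patterns run on UniProt, which is exactly the cross-resource federation the abstract describes.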


2019 ◽  
Vol 18 (03) ◽  
pp. 953-979 ◽  
Author(s):  
Lingling Zhang ◽  
Minghui Zhao ◽  
Zili Feng

In the era of big data, how to obtain useful knowledge from online news and use it as a basis for investment decisions has become a hotspot of industrial and academic research. To date, there has been research and practice on acquiring explicit knowledge from news, but tacit knowledge acquisition is still under exploration. Building on the general mechanisms of domain knowledge, knowledge reasoning, and knowledge discovery, this paper constructs a framework for discovering tacit knowledge in news and applying that knowledge to stock forecasting. The concrete work is as follows. First, according to the characteristics of the financial field and the conceptual cube, a conceptual industry–company–product structure is constructed and a framework for the domain ontology is put forward. Second, on top of the financial-field ontology, a financial news knowledge management framework is proposed. In addition, using attributes in the ontology and domain rules extracted from news text, a knowledge reasoning mechanism for financial news is constructed to achieve financial news knowledge discovery. Finally, news knowledge reflecting important information about stock movements is integrated into a traditional stock price forecasting model, and the newly proposed model performs well in an empirical analysis of the polyester industry.
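The industry–company–product structure and the rule-based reasoning step can be sketched as a toy graph walk. Everything here — the entity names, the single propagation rule, and the attenuation weight — is hypothetical and stands in for the paper's ontology attributes and extracted domain rules.

```python
# Toy industry-company-product hierarchy and one illustrative domain rule:
# a news signal attached to a product propagates upward to its company
# and, attenuated, to its industry. Names and weights are hypothetical.
PARENT = {
    "PET-chip": "PolyCo",     # product  -> company  (invented names)
    "PolyCo": "polyester",    # company  -> industry
}

def propagate(signal):
    """Propagate {entity: score} news signals up the concept hierarchy."""
    scores = dict(signal)
    for entity, score in signal.items():
        node, s = entity, score
        while node in PARENT:          # walk product -> company -> industry
            node = PARENT[node]
            s *= 0.5                   # attenuate at each level (assumed)
            scores[node] = scores.get(node, 0.0) + s
    return scores

# A positive product-level news item lifts the company and the industry;
# such derived scores could then feed a stock price forecasting model.
print(propagate({"PET-chip": 1.0}))
```

The paper's actual mechanism reasons over ontology attributes and rules mined from news text; this sketch only shows why an explicit industry–company–product structure lets product-level news become company- and industry-level evidence.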


2015 ◽  
Vol 22 (3) ◽  
pp. 99-104 ◽  
Author(s):  
Henryk Krawczyk ◽  
Michał Nykiel ◽  
Jerzy Proficz

Abstract The recently deployed supercomputer Tryton, located in the Academic Computer Center of Gdansk University of Technology, provides substantial means for massively parallel processing. Moreover, the Center's status as one of the main network nodes in the PIONIER network enables fast and reliable transfer of data produced by miscellaneous devices scattered across the whole country. Typical examples of such data are streams containing radio-telescope and satellite observations. Their analysis, especially under real-time constraints, can be challenging and requires dedicated software components. We propose a solution for such parallel analysis on the supercomputer, supervised by the KASKADA platform, which, in conjunction with immersive 3D visualization techniques, can be used to solve problems such as pulsar detection and chronometry or oil-spill simulation on the sea surface.

