Design of Relation Extraction Framework to develop Knowledge Base

Web documents present information as natural language text, which machines cannot readily interpret. Searching for specific information in the sea of web documents has become very challenging, because a search returns many irrelevant documents along with the relevant ones. To retrieve relevant information, semantic knowledge can be stored in a domain-specific ontology, which helps in understanding the user's need. Intensive research has been going on in the field of text processing to develop ontologies using NLP techniques. The proposed technique is another effort in this direction. In this method, we extract syntactic structure with the Stanford parser, which performs tokenization, parsing, and morphological analysis of the text. Semantic rules are defined manually to identify valid concepts and the relations among them. Once the concepts, properties, and relationships among concepts are identified, the extracted information is visualized in the form of an ontology.
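
The manually defined rules described in this abstract can be illustrated with a minimal sketch. The abstract does not list the rules themselves, so the single noun-verb-noun rule below is an assumption, and the input is a hand-tagged sentence (Penn Treebank tags) rather than real Stanford parser output. The function name `extract_triples` is hypothetical.

```python
def extract_triples(tagged):
    """Apply one illustrative rule: NOUN VERB NOUN -> (concept, relation, concept).

    tagged: list of (token, POS tag) pairs using Penn Treebank tags.
    Assumption: this rule is a stand-in, not one of the paper's rules.
    """
    # Keep only nouns and verbs so determiners and adjectives do not break the pattern.
    content = [(tok, tag) for tok, tag in tagged if tag.startswith(("NN", "VB"))]
    triples = []
    for (s, s_tag), (v, v_tag), (o, o_tag) in zip(content, content[1:], content[2:]):
        if s_tag.startswith("NN") and v_tag.startswith("VB") and o_tag.startswith("NN"):
            triples.append((s, v, o))
    return triples


# Hand-tagged example sentence (hypothetical input).
sentence = [("A", "DT"), ("parser", "NN"), ("produces", "VBZ"),
            ("a", "DT"), ("tree", "NN")]
triples = extract_triples(sentence)
```

Each extracted triple gives two candidate concepts and a candidate relation between them, which is the kind of information an ontology builder would then validate and store.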


CDM: AN APPROACH TO LEARNING IN TEXT CATEGORIZATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821309600016x ◽

1996 ◽

Vol 05 (01n02) ◽

pp. 229-253 ◽

Cited By ~ 8

Author(s):

JEFFREY L. GOLDBERG

Keyword(s):

Machine Learning ◽

Text Categorization ◽

Machine Learning Algorithms ◽

Skewed Distribution ◽

Approach To Learning ◽

Natural Language Text ◽

Domain Specific ◽

Category Discrimination ◽

Language Text ◽

Discrimination Method

The Category Discrimination Method (CDM) is a new machine learning algorithm designed specifically for text categorization. The motivation is that there are statistical problems associated with natural language text when it is applied as input to existing machine learning algorithms (too much noise, too many features, skewed distribution). The bases of the CDM are research results about the way that humans learn categories and concepts vis-à-vis contrasting concepts. The essential formula is cue validity, borrowed from cognitive psychology, and used to select from all possible single word-based features the best predictors of a given category. The hypothesis that CDM's performance will exceed that of two non-domain-specific algorithms, Bayesian classification and decision tree learners, is empirically tested.
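
Cue validity is usually defined as the conditional probability of a category given a cue, P(category | word). The abstract does not give CDM's exact formulation, so the following is only a sketch of that standard definition applied to single-word features, estimated from document counts. The function names and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def cue_validity(docs):
    """Estimate P(category | word) from document counts.

    docs: list of (category, text) pairs.
    Returns {category: {word: cue validity}}.
    """
    docs_with_word = Counter()
    docs_with_word_in_cat = defaultdict(Counter)
    for category, text in docs:
        # Count each word once per document.
        for word in set(text.lower().split()):
            docs_with_word[word] += 1
            docs_with_word_in_cat[category][word] += 1
    return {
        cat: {w: n / docs_with_word[w] for w, n in counts.items()}
        for cat, counts in docs_with_word_in_cat.items()
    }


def top_features(docs, category, k=3):
    """Return the k words with the highest cue validity (ties broken alphabetically)."""
    scores = cue_validity(docs)[category]
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Toy corpus (hypothetical data).
docs = [
    ("sports", "the team won the match"),
    ("sports", "the coach praised the team"),
    ("finance", "the bank raised the rate"),
    ("finance", "the market rate fell"),
]
scores = cue_validity(docs)
```

A word such as "the", which occurs in every category, gets a low cue validity, while a word confined to one category scores 1.0, which is why the measure can serve as a feature selector.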


Semantic-Based Indexing Approaches for Medical Document Clustering Using Cognitive Search

Advances in Social Networking and Online Communities - Cognitive Social Mining Applications in Data Analytics and Forensics ◽

10.4018/978-1-5225-7522-1.ch003 ◽

2019 ◽

pp. 41-64

Author(s):

Logeswari Shanmugam ◽

Premalatha K.

Keyword(s):

Text Processing ◽

Relevant Information ◽

Biomedical Literature ◽

Data Sets ◽

Original Text ◽

Biomedical Knowledge ◽

Text Data ◽

Natural Language Text ◽

Diverse Data ◽

Medical Document

Biomedical literature is the primary repository of biomedical knowledge, and PubMed is the most comprehensive database for collecting, organizing, and analyzing this textual knowledge. The high dimensionality of natural language text makes the text data quite noisy and sparse in the vector space. Hence, data preprocessing and feature selection are important steps in text processing. Ontologies select the meaningful terms semantically associated with the concepts in a document to reduce the dimensionality of the original text. In this chapter, semantic-based indexing approaches are proposed with cognitive search, which makes use of a domain ontology to extract relevant information from big and diverse data sets for users.
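
The idea of using an ontology to reduce dimensionality can be sketched as follows. This is not the chapter's indexing method, which the abstract does not detail; it only shows the general pattern of keeping terms that map to ontology concepts and discarding the rest. The tiny term-to-concept dictionary and the function name are hypothetical.

```python
from collections import Counter


def index_by_concept(text, term_to_concept):
    """Map a document to counts over ontology concepts.

    Terms absent from the ontology are dropped, so the resulting vector has
    one dimension per concept instead of one per vocabulary word.
    """
    tokens = text.lower().split()
    return Counter(term_to_concept[t] for t in tokens if t in term_to_concept)


# Hypothetical miniature ontology: surface term -> concept.
ontology = {"aspirin": "Drug", "ibuprofen": "Drug", "headache": "Symptom"}
vector = index_by_concept("Aspirin and ibuprofen relieve headache", ontology)
```

Here a five-word document collapses to two concept dimensions, which illustrates how concept-level indexing can reduce sparsity.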


Dependency Tree Kernels for Relation Extraction from Natural Language Text

Machine Learning and Knowledge Discovery in Databases - Lecture Notes in Computer Science ◽

10.1007/978-3-642-04174-7_18 ◽

2009 ◽

pp. 270-285 ◽

Cited By ~ 9

Author(s):

Frank Reichartz ◽

Hannes Korte ◽

Gerhard Paass

Keyword(s):

Natural Language ◽

Relation Extraction ◽

Natural Language Text ◽

Dependency Tree ◽

Language Text


Domain specific query generation from natural language text

2016 Sixth International Conference on Innovative Computing Technology (INTECH) ◽

10.1109/intech.2016.7845105 ◽

2016 ◽

Cited By ~ 4

Author(s):

Anum Iftikhar ◽

Erum Iftikhar ◽

Muhammad Khalid Mehmood

Keyword(s):

Natural Language ◽

Natural Language Text ◽

Domain Specific ◽

Query Generation ◽

Language Text


Natural language text processing and the maximal join operator

Lecture Notes in Computer Science - Conceptual Structures: Knowledge Representation as Interlingua ◽

10.1007/3-540-61534-2_6 ◽

1996 ◽

pp. 100-114

Author(s):

Heike Petermann

Keyword(s):

Natural Language ◽

Text Processing ◽

Natural Language Text ◽

Language Text


An automated domain specific stop word generation method for natural language text classification

2011 International Symposium on Innovations in Intelligent Systems and Applications ◽

10.1109/inista.2011.5946149 ◽

2011 ◽

Cited By ~ 6

Author(s):

H. Ayral ◽

S. Yavuz

Keyword(s):

Natural Language ◽

Text Classification ◽

Word Generation ◽

Natural Language Text ◽

Domain Specific ◽

Stop Word ◽

Language Text


A Semantic Knowledge-Based Framework for Information Extraction and Exploration

International Journal of Decision Support System Technology ◽

10.4018/ijdsst.2021040105 ◽

2021 ◽

Vol 13 (2) ◽

pp. 85-109

Author(s):

Abduladem Aljamel ◽

Taha Osman ◽

Dhavalkumar Thakker

Keyword(s):

Open Data ◽

Semantic Knowledge ◽

Unstructured Data ◽

Specific Information ◽

Use Case ◽

Binary Relations ◽

Semantic Framework ◽

Domain Specific ◽

Knowledge Based ◽

Financial Domain

The availability of online documents that describe domain-specific information provides an opportunity to employ a knowledge-based approach to extracting information from web data. This research proposes a novel, comprehensive semantic knowledge-based framework that helps to transform unstructured data so that it can be easily exploited by data scientists. The resultant semantic knowledge base is reasoned over to infer new facts and classify events that might be of importance to end users. The target use case for the framework implementation was the financial domain, which represents an important class of dynamic applications that require the modelling of non-binary relations. Such complex relations are becoming increasingly common in the era of linked open data. This research in modelling and reasoning upon such relations is a further contribution of the proposed semantic framework, where non-binary relations are semantically modelled by adapting the semantic reasoning axioms to fit the intermediate resources required by N-ary relations.
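
A common way to express a non-binary relation with binary triples is to introduce an intermediate resource that stands for the relation instance and to attach each participant to it. The sketch below shows that general pattern only; the paper's adapted reasoning axioms are not given in the abstract, and the `ex:` names, the `reify` function, and the acquisition example are all hypothetical.

```python
from itertools import count

_node_ids = count(1)


def reify(relation_type, roles):
    """Represent one N-ary relation instance as binary triples.

    relation_type: class of the relation, e.g. "ex:Acquisition".
    roles: mapping from role property to participant.
    Returns the intermediate node and the list of (subject, predicate, object) triples.
    """
    node = f"_:rel{next(_node_ids)}"  # intermediate resource for this instance
    triples = [(node, "rdf:type", relation_type)]
    triples.extend((node, role, value) for role, value in roles.items())
    return node, triples


# Hypothetical financial event with three participants.
node, triples = reify(
    "ex:Acquisition",
    {"ex:buyer": "ex:CompanyA", "ex:target": "ex:CompanyB", "ex:price": "1000000"},
)
```

Because every participant hangs off the same intermediate node, a reasoner can treat the whole event as one resource while the data stays in plain triple form.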


A common architecture to encourage reuse of natural language/text processing tools

Proceedings of 8th Knowledge-Based Software Engineering Conference ◽

10.1109/kbse.1993.341193 ◽

2002 ◽

Cited By ~ 1

Author(s):

T. MacMillan ◽

E. Lusher ◽

M. Farinacci ◽

S. Laskowski ◽

L. Seligman ◽

...

Keyword(s):

Natural Language ◽

Text Processing ◽

Natural Language Text ◽

Common Architecture ◽

Language Text


Sentiment Analysis Using Common-Sense and Context Information

Computational Intelligence and Neuroscience ◽

10.1155/2015/715730 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 65

Author(s):

Basant Agarwal ◽

Namita Mittal ◽

Pooja Bansal ◽

Sonal Garg

Keyword(s):

Sentiment Analysis ◽

Common Sense ◽

Research Community ◽

Context Information ◽

Analysis Model ◽

Natural Language Text ◽

Domain Specific ◽

Wide Range ◽

Social Applications ◽

Language Text

Sentiment analysis research has been increasing tremendously in recent times due to its wide range of business and social applications. Sentiment analysis of unstructured natural language text has recently received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on context information and common-sense knowledge extracted from a ConceptNet-based ontology. The ConceptNet-based ontology is used to determine the domain-specific concepts, which in turn produce the important domain-specific features. Further, the polarities of the extracted concepts are determined using a contextual polarity lexicon, which we developed by considering the context information of a word. Finally, the semantic orientations of the domain-specific features of the review document are aggregated based on the importance of each feature with respect to the domain. The importance of a feature is determined by its depth in the ontology. Experimental results show the effectiveness of the proposed methods.
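
The final aggregation step can be sketched as a weighted average of feature polarities. The abstract says only that importance depends on a feature's depth in the ontology; the weight 1/depth used here (features nearer the root count more) is an assumption, as are the function name and the toy camera-review data.

```python
def aggregate_polarity(features, depth, polarity):
    """Weighted average of feature polarities for one review.

    features: feature names found in the review.
    depth: feature -> depth in the ontology (root-level features have depth 1).
    polarity: feature -> contextual polarity in [-1, 1].
    Assumption: importance = 1 / depth; the paper's weighting may differ.
    """
    total = 0.0
    weight_sum = 0.0
    for feature in features:
        if feature not in depth or feature not in polarity:
            continue  # skip features the ontology or lexicon does not cover
        weight = 1.0 / depth[feature]
        total += weight * polarity[feature]
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Hypothetical review of a camera.
depth = {"camera": 1, "lens": 2, "battery": 2}
polarity = {"camera": 0.8, "lens": -0.4, "battery": 0.2}
score = aggregate_polarity(["camera", "lens", "battery"], depth, polarity)
```

With these values the negative opinion about the lens is outweighed by the positive opinion about the camera as a whole, because the top-level feature carries twice the weight of each sub-feature.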


Exploiting Parallel News Streams for Unsupervised Event Extraction

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00127 ◽

2015 ◽

Vol 3 ◽

pp. 117-129 ◽

Cited By ~ 2

Author(s):

Congle Zhang ◽

Stephen Soderland ◽

Daniel S. Weld

Keyword(s):

Graphical Model ◽

Relation Extraction ◽

Event Extraction ◽

Training Data ◽

Natural Language Text ◽

Distant Supervision ◽

Wide Range ◽

Precision Recall Curve ◽

Language Text ◽

Better Than

Most approaches to relation extraction, the task of extracting ground facts from natural language text, are based on machine learning and thus starved by scarce training data. Manual annotation is too expensive to scale to a comprehensive set of relations. Distant supervision, which automatically creates training data, only works with relations that already populate a knowledge base (KB). Unfortunately, KBs such as FreeBase rarely cover event relations (e.g., "person travels to location"). Thus, the problem of extracting a wide range of events, for example from news streams, is an important open challenge. This paper introduces NewsSpike-RE, a novel, unsupervised algorithm that discovers event relations and then learns to extract them. NewsSpike-RE uses a novel probabilistic graphical model to cluster sentences describing similar events from parallel news streams. These clusters then comprise training data for the extractor. Our evaluation shows that NewsSpike-RE generates high-quality training sentences and learns extractors that perform much better than rival approaches, more than doubling the area under a precision-recall curve compared to Universal Schemas.
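
The comparison metric named in this abstract, area under a precision-recall curve, is often summarized by average precision over a confidence-ranked list of extractions. The sketch below computes that summary; it is a stand-in, since the abstract does not state how the paper computed the area, and the function name and sample scores are hypothetical.

```python
def average_precision(scored):
    """Average precision over ranked extractions.

    scored: list of (confidence, is_correct) pairs, one per extracted relation.
    Averages the precision measured at the rank of each correct extraction.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    total_correct = sum(1 for _, ok in ranked if ok)
    if total_correct == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, ok) in enumerate(ranked, start=1):
        if ok:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / total_correct


# Hypothetical extractor output: (confidence, whether the extraction is correct).
extractions = [(0.9, True), (0.8, True), (0.6, False), (0.5, True), (0.2, False)]
ap = average_precision(extractions)
```

An extractor that ranks its correct extractions above its errors scores close to 1.0, so a doubling of this kind of area reflects a large gain in ranking quality.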
