An Approach for Ontology Based Information Extraction

2014 ◽  
Vol 12 (1) ◽  
pp. 15-20 ◽  
Author(s):  
N. Borisova

Abstract An approach for Ontology based Information Extraction (OBIE) from unstructured text in the Bulgarian language is presented in this paper. The presented method and algorithm provide a solution for automatic data extraction from text documents exploiting ontologies. To this end, in addition to the standard tools for processing language resources in an open source free software, a dictionary-based lemmatizer for Bulgarian has been developed and integrated. It is distributed as free software, publicly available to download and use under the GPL v3 license. Due to the specifics of inflection in Bulgarian the developed tools for lemmatization will contribute to improving the results of the POS tagger. This approach will offer opportunities for developing a dynamically created gazetteer that is, in combination with a few other generic GATE resources, capable of producing ontologybased annotations over the given content with regards to the given ontology. This algorithm can also be used in the processes of content creation and management of information and knowledge.

2013 ◽  
Vol 7 (2) ◽  
pp. 574-579 ◽  
Author(s):  
Dr Sunitha Abburu ◽  
G. Suresh Babu

Day by day the volume of information availability in the web is growing significantly. There are several data structures for information available in the web such as structured, semi-structured and unstructured. Majority of information in the web is presented in web pages. The information presented in web pages is semi-structured.  But the information required for a context are scattered in different web documents. It is difficult to analyze the large volumes of semi-structured information presented in the web pages and to make decisions based on the analysis. The current research work proposed a frame work for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies  data extraction, data consolidation, data analysis and decision making based on the information presented in the web pages.The proposed frame work integrates web crawling, information extraction and data mining technologies for better information analysis that helps in effective decision making.   It enables people and organizations to extract information from various sourses of web and to make an effective analysis on the extracted data for effective decision making.  The proposed frame work is applicable for any application domain. Manufacturing,sales,tourisum,e-learning are various application to menction few.The frame work is implemetnted and tested for the effectiveness of the proposed system and the results are promising.


Author(s):  
Janis M Nolde ◽  
Ajmal Mian ◽  
Luca Schlaich ◽  
Justine Chan ◽  
Leslie Marisol Lugo-Gavidia ◽  
...  

Author(s):  
Sumathi S. ◽  
Rajkumar S. ◽  
Indumathi S.

Lease abstraction is the method of compartmentalization of key data from a lease document. Lease document for a property contains key business, money, and legal data about a property. A lease abstract report contains details concerning the property location and basic lease details, price schedules, key events, terms and conditions, automobile parking arrangements, and landowner and tenant obligations. Abstracting a true estate contract into electronic type facilitates easy access to key data, exchanging the tedious method of reading the whole contents of the contract every time. Language process may be used for data extraction and abstraction of knowledge from lease documents.


Author(s):  
Min Song ◽  
Il-Yeol Song ◽  
Xiaohua Hu ◽  
Hyoil Han

Information extraction (IE) technology has been defined and developed through the US DARPA Message Understanding Conferences (MUCs). IE refers to the identification of instances of particular events and relationships from unstructured natural language text documents into a structured representation or relational table in databases. It has proved successful at extracting information from various domains, such as the Latin American terrorism, to identify patterns related to terrorist activities (MUC-4). Another domain, in the light of exploiting the wealth of natural language documents, is to extract the knowledge or information from these unstructured plain-text files into a structured or relational form. This form is suitable for sophisticated query processing, for integration with relational databases, and for data mining. Thus, IE is a crucial step for fully making text files more easily accessible.


Author(s):  
Mirel Cosulschi ◽  
Adrian Giurca ◽  
Bogdan Udrescu ◽  
Nicolae Constantinescu ◽  
Mihai Gabroveanu

Sign in / Sign up

Export Citation Format

Share Document