Ontology-Based Information Extraction from the World Wide Web

Author(s):  
Jan Korst ◽  
Gijs Geleijnse ◽  
Nick de Jong ◽  
Michael Verschoor
Author(s):  
Sally Mohamed ◽  
Mahmoud Hussien ◽  
Hamdy M. Mousa

The World Wide Web holds a massive amount of varied information and data, and the number of Arabic users and the volume of Arabic content are growing rapidly. Information extraction is essential for accessing and sorting data on the web, but it is a challenge, especially for languages with complex morphology such as Arabic. Consequently, the current trend is to build new corpora that make information extraction easier and more precise. This paper presents a linguistically analyzed Arabic corpus that includes dependency relations. The collected data covers five domains: sports, religion, weather, news, and biomedicine. The output is in the CoNLL universal lattice file format (CoNLL-UL). The corpus contains an index of the sentences and their linguistic metadata to enable quick mining and search across the corpus. It has seventeen morphological annotations and eight features based on the identification of textual structures, which help to recognize and understand the grammatical characteristics of the text and to derive the dependency relations. Parsing and dependency analysis were performed with the universal dependency model and corrected manually. The designed Arabic corpus helps users quickly obtain linguistic annotations for a text and makes information extraction techniques easy and clear to learn. The results show an average improvement in the dependency relation corpus.
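
As a rough illustration of how a sentence index over a dependency-annotated corpus might support quick search, the sketch below parses tab-separated annotation rows and builds a lemma-to-sentence lookup. It is not the authors' implementation. It assumes the plain 10-column CoNLL-U layout as a stand-in for CoNLL-UL, and the function names (`parse_conllu`, `build_index`, `dependency_triples`) and the two toy English sentences are invented for this example.

```python
from collections import defaultdict

# Column names of the 10-column CoNLL-U layout (assumed stand-in for CoNLL-UL).
COLUMNS = ["id", "form", "lemma", "upos", "xpos",
           "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Split annotated text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # comment / sentence-level metadata line
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip malformed rows
        current.append(dict(zip(COLUMNS, fields)))
    if current:
        sentences.append(current)
    return sentences


def build_index(sentences, key="lemma"):
    """Map each value of `key` to the set of sentence numbers containing it."""
    index = defaultdict(set)
    for sent_no, tokens in enumerate(sentences):
        for tok in tokens:
            index[tok[key]].add(sent_no)
    return index


def dependency_triples(tokens):
    """Return (dependent form, relation, head form) triples for one sentence."""
    by_id = {tok["id"]: tok for tok in tokens}
    triples = []
    for tok in tokens:
        head = by_id.get(tok["head"])  # head "0" has no token: the root
        triples.append((tok["form"], tok["deprel"],
                        head["form"] if head else "ROOT"))
    return triples


# Toy data, invented for illustration only (not taken from the corpus).
ROWS = [
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "rain", "rain", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "stopped", "stop", "VERB", "_", "_", "0", "root", "_", "_"],
    [],
    ["1", "Rain", "rain", "NOUN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "fell", "fall", "VERB", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

sentences = parse_conllu(SAMPLE)
lemma_index = build_index(sentences)
print(sorted(lemma_index["rain"]))
print(dependency_triples(sentences[0]))
```

Indexing by lemma rather than surface form is one plausible choice for a morphologically rich language, since a single lemma can surface in many inflected forms; passing `key="upos"` or `key="deprel"` to `build_index` would index the same sentences by part of speech or dependency relation instead.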


2009 ◽  
Author(s):  
Blair Williams Cronin ◽  
Ty Tedmon-Jones ◽  
Lora Wilson Mau

2019 ◽  
pp. 3-6
Author(s):  
D. A. Bogdanova

The article provides an overview of the European Union forum on children's online safety, the Safer Internet Forum (SIF) 2019, which was held in Brussels, Belgium, in November 2019. It describes the current Internet risks faced by World Wide Web users, especially children.

