Uncertainty Reduction for Knowledge Discovery and Information Extraction on the World Wide Web

2012 ◽  
Vol 100 (9) ◽  
pp. 2658-2674 ◽  
Author(s):  
Heng Ji ◽  
Hongbo Deng ◽  
Jiawei Han

Author(s):  
Stefan Sommer ◽  
Tom Miller ◽  
Andreas Hilbert

On the World Wide Web, users are an important information source for companies and institutions. People use Web 2.0 communication platforms, such as Twitter, to express their sentiments about products, politics, society, or even private situations. In 2014, Twitter users worldwide submitted 582 million messages (tweets) per day. Processing this mass of Web 2.0 data (e.g., Twitter data) is a key capability in the modern IT landscape of companies and institutions, because user sentiment can be very valuable for product development, the refinement of marketing strategies, or the prediction of political elections. This chapter aims to provide a framework for extracting, preprocessing, and analyzing customer sentiment on Twitter across different domains.
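The extract–preprocess–analyze pipeline described above can be sketched in a few lines. This is a minimal illustration, not the chapter's actual framework: the tiny lexicon and the sample tweets are assumptions for demonstration, and a real system would use a full sentiment lexicon or a trained classifier.

```python
import re

# Hypothetical mini-lexicon (illustrative only); a production framework
# would use a full sentiment lexicon or a trained classifier.
LEXICON = {"great": 1, "love": 1, "good": 1, "bad": -1, "awful": -1, "hate": -1}

def preprocess(tweet):
    """Lowercase, strip URLs and @/# markers, and tokenize."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", "", tweet)  # remove links
    tweet = re.sub(r"[@#]", "", tweet)          # drop markers, keep the word
    return re.findall(r"[a-z']+", tweet)

def sentiment(tweet):
    """Sum lexicon scores over tokens: >0 positive, <0 negative, 0 neutral."""
    return sum(LEXICON.get(tok, 0) for tok in preprocess(tweet))

print(sentiment("I love the new phone, battery life is great! #happy"))  # 2
print(sentiment("Awful service, I hate waiting @support"))               # -2
```

The same three stages (extraction, preprocessing, analysis) generalize: extraction would pull tweets from an API or archive, and the scoring step can be swapped for any classifier without changing the pipeline shape.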


Big Data ◽  
2016 ◽  
pp. 1260-1276
Author(s):  
Stefan Sommer ◽  
Tom Miller ◽  
Andreas Hilbert



Author(s):  
Sally Mohamed ◽  
Mahmoud Hussien ◽  
Hamdy M. Mousa

The World Wide Web holds a massive amount of heterogeneous information, and the number of Arabic users and the volume of Arabic content are increasing rapidly. Information extraction is essential for accessing and organizing data on the web, and it becomes particularly challenging for morphologically complex languages such as Arabic. Consequently, the trend today is to build new corpora that make information extraction easier and more precise. This paper presents a linguistically analyzed Arabic corpus that includes dependency relations. The collected data covers five domains: sport, religion, weather, news, and biomedicine. The output is in the CoNLL universal lattice file format (CoNLL-UL). The corpus contains an index of the sentences and their linguistic metadata to enable fast mining and search across the corpus. It provides seventeen morphological annotations and eight features based on the identification of textual structures, which help to recognize and understand the grammatical characteristics of the text and to derive the dependency relations. Parsing and dependency annotation were performed with the universal dependency model and corrected manually. The results show an improvement in the dependency-relation corpus. The designed Arabic corpus makes it quick to obtain linguistic annotations for a text and makes information extraction techniques easier and clearer to apply.
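To make the annotation format concrete: CoNLL-UL builds on the ten-column CoNLL-U token lines used for Universal Dependencies. A minimal reader for the dependency columns might look like the sketch below; the English sample sentence is illustrative (the paper's corpus is Arabic), and only four of the ten columns are extracted here.

```python
# Minimal reader for CoNLL-U style token lines (the 10-column format
# underlying CoNLL-UL). Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS,
# HEAD, DEPREL, DEPS, MISC. Sample data is illustrative.
SAMPLE = """\
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tweather\tweather\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tchanged\tchange\tVERB\t_\t_\t0\troot\t_\t_
"""

def read_conllu(text):
    """Yield (id, form, head, deprel) for each token line, skipping comments."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield int(cols[0]), cols[1], int(cols[6]), cols[7]

for tid, form, head, rel in read_conllu(SAMPLE):
    print(tid, form, "->", head, rel)
```

An index over sentence IDs, as the paper describes, would then map each sentence to its block of token lines so that queries over morphological features or dependency relations avoid rescanning the whole corpus.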


This paper focuses on the integration of web information and the subsequent discovery of knowledge relationships within the integrated web data. The problem of information overload on the Internet has brought new attention to ideas for filtering information online. Knowledge discovery is often used for the analysis of large amounts of web data and enables a number of tasks that arise in the Semantic Web and require scalable solutions. The World Wide Web and related web information resources now arguably stand as the preferred medium for distributing information. This paper introduces several approaches to knowledge-relation discovery, including model creation, exact comparison, and dynamic comparison. The nature of the web and the mass of valuable information it holds make it an ideal stage for applying data mining techniques to discover knowledge efficiently from the World Wide Web. The enthusiasm of various research communities has made web-based data mining (web mining) a rich mixture of different technologies; the heterogeneity in the area of web mining is therefore as high as that of the web itself. Our objective is to design a general approach to personalized information filtering. Social information filtering essentially automates the process of word-of-mouth recommendation: items are recommended to a user based on values assigned by other people with similar taste. The system determines which users have similar taste via standard formulas for computing statistical correlations, making personalized recommendations from any type of database to a user based on similarities between that user's interest profile and those of other users.
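The "standard formulas for computing statistical correlations" mentioned above are typically the Pearson correlation over co-rated items, the classic similarity measure in social (collaborative) filtering. The sketch below is a minimal illustration with a toy rating matrix; the user and item names are assumptions, not data from the paper.

```python
from math import sqrt

# Toy user -> item -> rating matrix (illustrative data only).
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5},
    "carol": {"a": 1, "b": 5, "c": 2},
}

def pearson(u, v):
    """Pearson correlation between users u and v over items both rated."""
    common = set(ratings[u]) & set(ratings[v])
    n = len(common)
    if n == 0:
        return 0.0
    su = sum(ratings[u][i] for i in common)
    sv = sum(ratings[v][i] for i in common)
    num = sum(ratings[u][i] * ratings[v][i] for i in common) - su * sv / n
    du = sum(ratings[u][i] ** 2 for i in common) - su ** 2 / n
    dv = sum(ratings[v][i] ** 2 for i in common) - sv ** 2 / n
    if du == 0 or dv == 0:
        return 0.0
    return num / sqrt(du * dv)

# Alice's tastes track Bob's and run opposite to Carol's,
# so recommendations for Alice would be weighted toward Bob's ratings.
print(pearson("alice", "bob"))    # positive
print(pearson("alice", "carol"))  # negative
```

A recommender then predicts a user's rating for an unseen item as a correlation-weighted average of the ratings given by similar users, which is exactly the automated "word-of-mouth" process the paper describes.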

