Uncertainty Reduction for Knowledge Discovery and Information Extraction on the World Wide Web

2012 ◽  
Vol 100 (9) ◽  
pp. 2658-2674 ◽  
Author(s):  
Heng Ji ◽  
Hongbo Deng ◽  
Jiawei Han

Author(s):  
Stefan Sommer ◽  
Tom Miller ◽  
Andreas Hilbert

On the World Wide Web, users are an important information source for companies and institutions. People use Web 2.0 communication platforms, such as Twitter, to express their sentiments about products, politics, society, or even private situations. In 2014, Twitter users worldwide submitted 582 million messages (tweets) per day. Processing this mass of Web 2.0 data (e.g., Twitter data) is a key capability in the modern IT landscape of companies and institutions, because user sentiment can be very valuable for product development, the refinement of marketing strategies, or the prediction of political elections. This chapter aims to provide a framework for extracting, preprocessing, and analyzing customer sentiment on Twitter across different domains.
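The extract–preprocess–analyze pipeline described above can be sketched in a few lines. This is a minimal illustration, not the chapter's actual framework: the tiny lexicon and the sample tweets are assumptions for demonstration, and a real system would use a full sentiment lexicon or a trained classifier.

```python
import re

# Hypothetical mini-lexicon (illustrative only); a production framework
# would use a full sentiment lexicon or a trained classifier.
LEXICON = {"great": 1, "love": 1, "good": 1, "bad": -1, "awful": -1, "hate": -1}

def preprocess(tweet):
    """Lowercase, strip URLs and @/# markers, and tokenize."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", "", tweet)  # remove links
    tweet = re.sub(r"[@#]", "", tweet)          # drop markers, keep the word
    return re.findall(r"[a-z']+", tweet)

def sentiment(tweet):
    """Sum lexicon scores over tokens: >0 positive, <0 negative, 0 neutral."""
    return sum(LEXICON.get(tok, 0) for tok in preprocess(tweet))

print(sentiment("I love the new phone, battery life is great! #happy"))  # 2
print(sentiment("Awful service, I hate waiting @support"))               # -2
```

The same three stages (extraction, preprocessing, analysis) generalize: extraction would pull tweets from an API or archive, and the scoring step can be swapped for any classifier without changing the pipeline shape.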


Big Data ◽  
2016 ◽  
pp. 1260-1276
Author(s):  
Stefan Sommer ◽  
Tom Miller ◽  
Andreas Hilbert



Author(s):  
Sally Mohamed ◽  
Mahmoud Hussien ◽  
Hamdy M. Mousa

The World Wide Web holds a massive amount of heterogeneous information, and the number of Arabic users and the volume of Arabic content are increasing rapidly. Information extraction is essential for accessing and organizing data on the web, and it becomes particularly challenging for morphologically complex languages such as Arabic. Consequently, the trend today is to build new corpora that make information extraction easier and more precise. This paper presents a linguistically analyzed Arabic corpus that includes dependency relations. The collected data covers five domains: sport, religion, weather, news, and biomedicine. The output is in the CoNLL universal lattice file format (CoNLL-UL). The corpus contains an index of the sentences and their linguistic metadata to enable fast mining and search across the corpus. It provides seventeen morphological annotations and eight features based on the identification of textual structures, which help to recognize and understand the grammatical characteristics of the text and to derive the dependency relations. Parsing and dependency annotation were performed with the universal dependency model and corrected manually. The results show an improvement in the dependency-relation corpus. The designed Arabic corpus makes it quick to obtain linguistic annotations for a text and makes information extraction techniques easier and clearer to apply.
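To make the annotation format concrete: CoNLL-UL builds on the ten-column CoNLL-U token lines used for Universal Dependencies. A minimal reader for the dependency columns might look like the sketch below; the English sample sentence is illustrative (the paper's corpus is Arabic), and only four of the ten columns are extracted here.

```python
# Minimal reader for CoNLL-U style token lines (the 10-column format
# underlying CoNLL-UL). Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS,
# HEAD, DEPREL, DEPS, MISC. Sample data is illustrative.
SAMPLE = """\
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tweather\tweather\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tchanged\tchange\tVERB\t_\t_\t0\troot\t_\t_
"""

def read_conllu(text):
    """Yield (id, form, head, deprel) for each token line, skipping comments."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield int(cols[0]), cols[1], int(cols[6]), cols[7]

for tid, form, head, rel in read_conllu(SAMPLE):
    print(tid, form, "->", head, rel)
```

An index over sentence IDs, as the paper describes, would then map each sentence to its block of token lines so that queries over morphological features or dependency relations avoid rescanning the whole corpus.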


This paper focuses on the integration of web information and the subsequent discovery of knowledge relationships within the integrated web data. The problem of information overload on the Internet has brought new attention to ideas for filtering information online. Knowledge discovery is often used for the analysis of large amounts of web data and enables a number of tasks that arise in the Semantic Web and require scalable solutions. The World Wide Web and related web information resources now arguably stand as the preferred medium for distributing information. This paper introduces several approaches to knowledge-relation discovery, including model creation, exact comparison, and dynamic comparison. The nature of the web and the mass of valuable information it holds make it an ideal stage for applying data mining techniques to discover knowledge efficiently from the World Wide Web. The enthusiasm of various research communities has made web-based data mining (web mining) a rich mixture of different technologies; the heterogeneity in the area of web mining is therefore as high as that of the web itself. Our objective is to design a general approach to personalized information filtering. Social information filtering essentially automates the process of word-of-mouth recommendation: items are recommended to a user based on values assigned by other people with similar taste. The system determines which users have similar taste via standard formulas for computing statistical correlations, making personalized recommendations from any type of database to a user based on similarities between that user's interest profile and those of other users.
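The "standard formulas for computing statistical correlations" mentioned above are typically the Pearson correlation over co-rated items, the classic similarity measure in social (collaborative) filtering. The sketch below is a minimal illustration with a toy rating matrix; the user and item names are assumptions, not data from the paper.

```python
from math import sqrt

# Toy user -> item -> rating matrix (illustrative data only).
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5},
    "carol": {"a": 1, "b": 5, "c": 2},
}

def pearson(u, v):
    """Pearson correlation between users u and v over items both rated."""
    common = set(ratings[u]) & set(ratings[v])
    n = len(common)
    if n == 0:
        return 0.0
    su = sum(ratings[u][i] for i in common)
    sv = sum(ratings[v][i] for i in common)
    num = sum(ratings[u][i] * ratings[v][i] for i in common) - su * sv / n
    du = sum(ratings[u][i] ** 2 for i in common) - su ** 2 / n
    dv = sum(ratings[v][i] ** 2 for i in common) - sv ** 2 / n
    if du == 0 or dv == 0:
        return 0.0
    return num / sqrt(du * dv)

# Alice's tastes track Bob's and run opposite to Carol's,
# so recommendations for Alice would be weighted toward Bob's ratings.
print(pearson("alice", "bob"))    # positive
print(pearson("alice", "carol"))  # negative
```

A recommender then predicts a user's rating for an unseen item as a correlation-weighted average of the ratings given by similar users, which is exactly the automated "word-of-mouth" process the paper describes.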

