The case for NLP-enhanced database tuning

The Semantic Web will require semantic representations of information that computers can understand when they process business applications. Most Web content is currently represented in formats such as text that facilitate human understanding, rather than in more structured formats that allow automated processing and computer understanding. This chapter explores how natural language processing (NLP) principles, using linguistic analysis, can be employed to extract information from unstructured Web documents and translate it into extensible markup language (XML), the enabling currency of today’s e-business applications and the foundation for the emerging Semantic Web languages of tomorrow. Our prototype system is built and tested with online financial documents.

Opinion Mining and Information Retrieval

Handbook of Research on Ambient Intelligence and Smart Environments - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-61692-857-5.ch030 ◽

2011 ◽

pp. 640-652

Author(s):

Shishir K. Shandilya ◽

Suresh Jain

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Ambient Intelligence ◽

Opinion Mining ◽

Training Data ◽

Machine Learning Techniques ◽

Web Documents ◽

Opinion Extraction ◽

Traditional Natural

The explosive increase in Internet usage has attracted technologies for automatically mining user-generated content (UGC) from Web documents. These UGC-rich resources have raised new opportunities and challenges for carrying out opinion extraction and mining tasks to produce opinion summaries. Opinion extraction technology allows users to retrieve and analyze people’s opinions scattered over Web documents. Opinion mining is a process concerned with the opinions that consumers express about a product. It aims at understanding, extracting, and classifying opinions scattered through the unstructured text of online resources. Search engines perform well when one wants to learn about a product before purchase, but filtering and analyzing the search results is often complex and time-consuming. This created the need for intelligent technologies that can process unstructured online text documents through automatic classification, concept recognition, text summarization, etc. These tools are based on traditional natural language techniques, statistical analysis, and machine learning techniques. Automatic knowledge extraction over large text collections like the Internet has been a challenging task due to many constraints, such as the need for large annotated training data, the requirement of extensive manual processing of data, and the huge number of domain-specific terms. Ambient Intelligence (AmI) in web-enabled technologies supports and promotes intelligent e-commerce services to enable the provision of personalized, self-configurable, and intuitive applications that make UGC knowledge available to support buying confidence. In this chapter, we will discuss various approaches to Opinion Mining that combine Ambient Intelligence, Natural Language Processing, and Machine Learning methods based on textual and grammatical clues.

Need for Computational and Psycho-linguistics Models in Natural Language Processing for Web Documents

2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) ◽

10.1109/ismsit50672.2020.9254754 ◽

2020 ◽

Author(s):

Muhammad Raza Naqvi ◽

Syed Khuram Shahzad ◽

Muhammad Waseem Iqbal ◽

Muhammad Ahmed ◽

Muhammad Usman Tahir ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Web Documents

Automatic classification of the emotional content of web documents

10.32920/ryerson.14653809.v1 ◽

2021 ◽

Author(s):

Alaa Hussainalsaid

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Automatic Classification ◽

Emotional Content ◽

Web Pages ◽

Web Documents ◽

N Gram

This thesis proposes automatic classification of the emotional content of web documents using Natural Language Processing (NLP) algorithms. To verify the performance of the algorithm, we used online articles and general documents, such as general web pages and news articles. The experiments used sentiment analysis to extract the sentiment of web documents. We used unigram and bigram approaches, which are special cases of the N-gram model with N=1 and N=2, respectively. The unigram model estimates the probability of each word in the corpus independently, whereas the bigram model estimates the probability of a word occurring given the previous word. Our results show that the unigram model performs better than the bigram model in automatic classification of the emotional content of web documents.
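
The difference between the two models described above can be illustrated with a minimal Python sketch. It computes plain maximum-likelihood estimates without smoothing on a made-up toy corpus; the function names `unigram_probs` and `bigram_probs` and the sample sentence are assumptions for illustration, not the classifier used in the thesis.

```python
from collections import Counter

def unigram_probs(tokens):
    """P(w): relative frequency of each word, independent of context."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}

def bigram_probs(tokens):
    """P(w | prev): frequency of the pair divided by frequency of prev."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Only words that have a successor can act as a "previous word".
    prev_counts = Counter(tokens[:-1])
    return {
        (prev, word): count / prev_counts[prev]
        for (prev, word), count in pair_counts.items()
    }

# Hypothetical toy corpus, chosen only to keep the arithmetic easy to follow.
tokens = "the film was good the film was bad".split()
uni = unigram_probs(tokens)
bi = bigram_probs(tokens)

print(uni["the"])            # 2 of 8 tokens
print(bi[("was", "good")])   # "was" is followed by "good" in 1 of 2 cases
```

A real classifier would add smoothing for unseen words and pairs; the bigram table also grows much faster than the unigram table, which is one common reason a bigram model can underperform on small corpora.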

Automatic classification of the emotional content of web documents

10.32920/ryerson.14653809 ◽

2021 ◽

Author(s):

Alaa Hussainalsaid

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Automatic Classification ◽

Emotional Content ◽

Web Pages ◽

Web Documents ◽

N Gram

This thesis proposes automatic classification of the emotional content of web documents using Natural Language Processing (NLP) algorithms. To verify the performance of the algorithm, we used online articles and general documents, such as general web pages and news articles. The experiments used sentiment analysis to extract the sentiment of web documents. We used unigram and bigram approaches, which are special cases of the N-gram model with N=1 and N=2, respectively. The unigram model estimates the probability of each word in the corpus independently, whereas the bigram model estimates the probability of a word occurring given the previous word. Our results show that the unigram model performs better than the bigram model in automatic classification of the emotional content of web documents.

Indonesian Information Extraction : Challenges and Opportunities

JATISI (Jurnal Teknik Informatika dan Sistem Informasi) ◽

10.35957/jatisi.v8i1.710 ◽

2021 ◽

Vol 8 (1) ◽

pp. 421-429

Author(s):

Yan Puspitarani

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Information Extraction ◽

Language Processing ◽

Research Trends ◽

Structured Data ◽

Daily Lives ◽

Unstructured Text ◽

Challenges And Opportunities ◽

Data Source

Information extraction is a part of natural language processing that aims to find, retrieve, or process information. The data source for information extraction is text. Text cannot be separated from people's daily lives. Through text, a lot of confidential information can be obtained. To produce information, unstructured text is converted into structured data. Researchers take many approaches to this process, but most of the studies are on English. Therefore, this paper presents current research trends, challenges, and opportunities in information extraction for Indonesian.

NLP TOKEN MATCHING ON DATABASE USING BINARY SEARCH

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v3i1c.2766 ◽

2012 ◽

Vol 3 (1) ◽

pp. 140-143

Author(s):

Ekta Aggarwal ◽

Shreeja Nair

Keyword(s):

Natural Language ◽

Language Processing ◽

Time Complexity ◽

Binary Search ◽

Translation Process ◽

Query Translation ◽

Natural Language Text ◽

Natural Language Interface ◽

Reduced Time ◽

Language Text

Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. The paper deals with database access, whereby data can be fetched from data resources with reduced time complexity. The retrieval techniques presented are based on binary search. To interpret a query, a natural language interface refers to words in its own dictionary as well as to words in the standard dictionary. The main contribution of this investigation is addressing the problem of improving the accuracy of the query translation process by using the information provided by the database schema.
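
As a rough illustration of matching query tokens against a dictionary by binary search, the following Python sketch looks up each word of a query in a sorted list of terms using the standard `bisect` module. The term list, the sample query, and the helper names `find_token` and `match_tokens` are hypothetical; the paper's own dictionary structure and translation steps are not reproduced here.

```python
from bisect import bisect_left

def find_token(sorted_terms, token):
    """Binary search: return the index of token in sorted_terms, or -1."""
    i = bisect_left(sorted_terms, token)
    if i < len(sorted_terms) and sorted_terms[i] == token:
        return i
    return -1

def match_tokens(query, sorted_terms):
    """Keep the query words that appear in the sorted dictionary."""
    return [t for t in query.lower().split() if find_token(sorted_terms, t) != -1]

# Hypothetical dictionary built from schema names; it must be sorted once up front.
schema_terms = sorted(["city", "employee", "name", "salary"])
matched = match_tokens("Show name and salary of every employee", schema_terms)
print(matched)
```

Each lookup takes O(log n) comparisons against a dictionary of n terms, compared with O(n) for a linear scan, which is where the reduced time complexity comes from.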

Sentence Extraction for Machine Comprehension

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b3095.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 5511-5514

Keyword(s):

Natural Language ◽

Language Processing ◽

Question Answering ◽

Research Area ◽

Performance Comparison ◽

Word Count ◽

Natural Language Text ◽

Sentence Extraction ◽

The Given ◽

Language Text

Machine comprehension is a broad research area within the Natural Language Processing domain that deals with making a computerised system understand a given natural language text. A question answering system is one such variant, used to find the correct ‘answer’ for a ‘query’ using the supplied ‘context’. Using a single sentence instead of the whole context paragraph to determine the ‘answer’ is useful in terms of both computation and accuracy. Sentence selection can, therefore, be considered a first step towards getting the answer. This work devises a method for sentence selection that uses cosine similarity and common word count between each sentence of the context and the question. This removes the extensive training overhead associated with other available approaches, while still giving comparable results. The SQuAD dataset is used for accuracy-based performance comparison.
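
A training-free sentence selector of this kind can be sketched in Python as follows. The abstract does not say how the two signals are combined, so adding the cosine similarity to the common word count is an assumption made here, as are the regex-based tokenizer, the sentence splitter, and the sample context and question.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def select_sentence(context, question):
    """Return the context sentence that scores highest against the question."""
    q = Counter(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]

    def score(sentence):
        c = Counter(tokenize(sentence))
        common = len(set(c) & set(q))  # number of distinct shared words
        return cosine(c, q) + common   # assumed combination: simple sum

    return max(sentences, key=score)

# Hypothetical example, not taken from SQuAD.
context = (
    "Paris is the capital of France. "
    "Berlin is the capital of Germany. "
    "The Rhine flows through several countries."
)
best = select_sentence(context, "What is the capital of Germany?")
print(best)
```

Because the score depends only on word overlap, no model has to be trained; the trade-off is that paraphrased questions sharing few words with the answer sentence will be ranked poorly.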

Automatic Extraction of Causally Related Functions From Natural-Language Text for Biomimetic Design

Volume 7: 9th International Conference on Design Education; 24th International Conference on Design Theory and Methodology ◽

10.1115/detc2012-70732 ◽

2012 ◽

Cited By ~ 10

Author(s):

Hyunmin Cheong ◽

L. H. Shu

Keyword(s):

Natural Language ◽

Language Processing ◽

Biological Information ◽

Research Approach ◽

Initial Development ◽

Biomimetic Design ◽

Natural Language Text ◽

Part Of Speech ◽

Extraction Algorithm ◽

Processing Techniques

Identifying relevant analogies from biology is a significant challenge in biomimetic design. Our natural-language approach addresses this challenge by developing techniques to search biological information in natural-language format, such as books or papers. This paper presents the application of natural-language processing techniques, such as part-of-speech tags, typed-dependency parsing, and syntactic patterns, to automatically extract and categorize causally related functions from text with biological information. Causally related functions, which specify how one action is enabled by another action, are considered important for both knowledge representation used to model biological information and analogical transfer of biological information performed by designers. An extraction algorithm was developed and scored F-measures of 0.78–0.85 in an initial development test. Because this research approach uses inexpensive and domain-independent techniques, the extraction algorithm has the potential to automatically identify patterns of causally related functions from a large amount of text that contains either biological or design information.

Chinese Natural Language Oriented Dynamic Knowledge Extraction: A Case Study of Intelligent Design

Volume 2B: 27th Design Automation Conference ◽

10.1115/detc2001/dac-21147 ◽

2001 ◽

Author(s):

Renbin Xiao ◽

Ming Chang ◽

Hongbin Zhan ◽

Mu Su

Keyword(s):

Natural Language ◽

Intelligent Systems ◽

Intelligent Design ◽

Knowledge Extraction ◽

Prototype System ◽

Practical Case ◽

Natural Language Text ◽

Sentence Clustering

In view of the existing problems of knowledge acquisition in intelligent systems, a dynamic knowledge extraction method based on Chinese natural language sentence clustering is put forward, and the corresponding software prototype system is implemented. First, the paper introduces the proposed method by giving its outline. To demonstrate the role of the proposed method, we present a complete case study of the intelligent design of a certain machine tool. The design background of the product is presented and the implementation steps are given in detail to show the whole design process. Through this practical case, we succeeded in extracting knowledge from natural language text, and the effectiveness of the proposed method is verified.
