A Keyphrase-Based Approach to Text Summarization for English and Bengali Documents

2014 ◽  
Vol 5 (2) ◽  
pp. 28-38 ◽  
Author(s):  
Kamal Sarkar

With the rapid growth of the World Wide Web, information overload is becoming a problem for an increasingly large number of people. Since summarization helps humans digest the main content of a text document very rapidly, there is a need for an effective and powerful tool that can automatically summarize text. In this paper, we present a keyphrase-based approach to single-document summarization that first extracts a set of keyphrases from a document, then uses the extracted keyphrases to choose sentences from the document, and finally forms an extractive summary from the chosen sentences. We view keyphrases (single- or multi-word) as the important concepts, and we assume that an extractive summary of a document elaborates on the important concepts contained in the document to some permissible extent, controlled by the given summary length. We have tested our proposed keyphrase-based summarization approach on two different datasets: one for English and another for Bengali. The experimental results show that the performance of the proposed system is comparable to some state-of-the-art summarization systems.
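
The abstract names a three-stage pipeline (keyphrase extraction, keyphrase-driven sentence selection, length-bounded assembly) without giving its formulas. The sketch below is a minimal Python illustration of that pipeline, not the authors' method: the stopword-chunk candidate generation, the frequency-based phrase scoring, and the `budget` parameter are all assumptions made here.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are",
             "for", "that", "on", "with", "as", "by", "it", "this"}

def candidate_phrases(text, max_len=3):
    """Chunk the text at stopwords/punctuation; n-grams of the remaining
    content words (up to max_len words) are keyphrase candidates."""
    words = re.findall(r"[a-z]+", text.lower())
    chunks, current = [], []
    for w in words:
        if w in STOPWORDS:
            if current:
                chunks.append(tuple(current))
            current = []
        else:
            current.append(w)
    if current:
        chunks.append(tuple(current))
    grams = []
    for chunk in chunks:
        for n in range(1, max_len + 1):
            grams += [chunk[i:i + n] for i in range(len(chunk) - n + 1)]
    return grams

def summarize(document, budget=3):
    """Greedy extractive summary: rank sentences by the frequency-weighted
    keyphrases they contain, keep the top `budget`, restore document order."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    top_phrases = dict(Counter(candidate_phrases(document)).most_common(15))
    def score(sentence):
        return sum(top_phrases.get(g, 0) for g in candidate_phrases(sentence))
    chosen = set(sorted(sentences, key=score, reverse=True)[:budget])
    return " ".join(s for s in sentences if s in chosen)
```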


Author(s):  
Olfa Nasraoui

The Web information age has brought a dramatic increase in the sheer amount of information (Web content), in the access to this information (Web usage), and in the intricate complexities governing the relationships within this information (Web structure). Hence, not surprisingly, information overload when searching and browsing the World Wide Web (WWW) has become the plague du jour. One of the most promising and potent remedies against this plague comes in the form of personalization. Personalization aims to customize the interactions on a Web site, depending on the user’s explicit and/or implicit interests and desires.
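
As a toy illustration of the explicit/implicit distinction drawn above (and only that; the function and parameter names here are invented for the example, not taken from the chapter), a profile can blend declared interests with topics inferred from usage before ranking site content:

```python
from collections import Counter

def build_profile(explicit_topics, clicked_topics, implicit_weight=0.5):
    """Blend declared interests (explicit) with topics inferred from
    browsing behaviour (implicit) into one weighted interest vector."""
    profile = Counter({t: 1.0 for t in explicit_topics})
    clicks = Counter(clicked_topics)
    total = sum(clicks.values()) or 1
    for topic, n in clicks.items():
        profile[topic] += implicit_weight * n / total
    return profile

def personalize(items, profile):
    """Order a site's items by overlap with the user's interest vector."""
    return sorted(items,
                  key=lambda i: sum(profile.get(t, 0.0) for t in i["topics"]),
                  reverse=True)
```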



Author(s):  
Esharenana E. Adomi

The World Wide Web (WWW) has led to the advent of the information age. With increased demand for information from various quarters, the Web has turned out to be a veritable resource. Web surfers in the early days were frustrated by the delay in finding the information they needed. The first major leap for information retrieval came from the deployment of Web search engines such as Lycos, Excite, AltaVista, etc. The rapid growth in the popularity of the Web during the past few years has led to a precipitous pronouncement of death for the online services that preceded the Web in the wired world.



Author(s):  
Omer Casher ◽  
Gudge K. Chandramohan ◽  
Martin J. Hargreaves ◽  
Christopher Leach ◽  
Peter Murray-Rust ◽  
...  


Author(s):  
Mu-Chun Su ◽  
Shao-Jui Wang ◽  
Chen-Ko Huang ◽  
Pa-Chun Wang ◽  
...  

Most of the dramatically increased amount of information available on the World Wide Web is provided via HTML and formatted for human browsing rather than for software programs. This situation calls for a tool that automatically extracts information from semistructured Web information sources, increasing the usefulness of value-added Web services. We present a signal-representation-based parser (SIRAP) that breaks Web pages up into logically coherent groups, for example, groups of information related to an entity. Templates for records with different tag structures are generated incrementally by a Histogram-Based Correlation Coefficient (HBCC) algorithm; records on a Web page are then detected efficiently by matching against the generated templates. Hundreds of Web pages from 17 state-of-the-art search engines were used to demonstrate the feasibility of our approach.
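
The abstract does not reproduce HBCC itself, but its general idea, correlating tag-count "signals" to group structurally similar records, can be sketched as follows. The regex-based tag extraction and the 0.9 threshold are assumptions of this sketch, not details from the paper.

```python
import math
import re
from collections import Counter

def tag_histogram(fragment):
    """Represent an HTML fragment as a 'signal': counts of each tag name."""
    return Counter(t.lower()
                   for t in re.findall(r"<\s*([A-Za-z][A-Za-z0-9]*)", fragment))

def correlation(h1, h2):
    """Pearson correlation coefficient between two tag histograms."""
    tags = sorted(set(h1) | set(h2))
    if not tags:
        return 0.0
    x = [h1.get(t, 0) for t in tags]
    y = [h2.get(t, 0) for t in tags]
    mx, my = sum(x) / len(tags), sum(y) / len(tags)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cluster_records(fragments, threshold=0.9):
    """Grow templates incrementally: a fragment joins the first template whose
    histogram correlates above the threshold, otherwise it founds a new one."""
    templates = []  # list of (representative histogram, member fragments)
    for frag in fragments:
        h = tag_histogram(frag)
        for rep, members in templates:
            if correlation(rep, h) >= threshold:
                members.append(frag)
                break
        else:
            templates.append((h, [frag]))
    return templates
```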



2012 ◽  
Vol 263-266 ◽  
pp. 1902-1909
Author(s):  
Oi Mean Foong ◽  
Mellissa Lee

The explosion of information on the World Wide Web overwhelms readers with limitless information. Large Internet articles or journals are often cumbersome to read as well as to comprehend. More often than not, readers are immersed in a pool of information with limited time to assimilate all of the articles. As technology advances, it becomes more convenient to access information on the go, i.e., to carry information on mobile devices. In this research, semantic- and syntactic-based summarization is implemented in a text summarizer to solve the information overload problem whilst providing a more coherent summary. The objective is to integrate WordNet into the proposed system, called TextSumIt, which condenses lengthy documents into summarized text. Empirical experiments show that it produces satisfactory preliminary results on Android mobile phones.
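
The abstract does not detail how TextSumIt uses WordNet internally. One plausible minimal use, sketched below with NLTK's WordNet interface, is synonym expansion, so that sentences matching a topic word's synonyms (not just the word itself) score higher. The function names and the scoring rule are illustrative assumptions.

```python
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def synonym_set(word):
    """The word plus every WordNet lemma that shares a synset with it."""
    return {lemma.lower()
            for syn in wn.synsets(word)
            for lemma in syn.lemma_names()} | {word.lower()}

def semantic_score(sentence_words, topic_words):
    """Count sentence words that match a topic word or any of its synonyms,
    so semantically related sentences are not missed by exact matching."""
    expanded = set().union(*(synonym_set(t) for t in topic_words))
    return sum(1 for w in sentence_words if w.lower() in expanded)
```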



10.28945/2854 ◽  
2005 ◽  
Author(s):  
Shirlee-ann Knight ◽  
Janice Burn

The rapid growth of the Internet as an environment for information exchange and the lack of enforceable standards regarding the information it contains have led to numerous information quality problems. A major issue is the inability of Search Engine technology to wade through the vast expanse of questionable content and return "quality" results to a user's query. This paper attempts to address some of the issues involved in determining what quality is as it pertains to information retrieval on the Internet. The IQIP model is presented as an approach to managing the choice and implementation of quality-related algorithms in an Internet-crawling Search Engine.
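
The abstract does not define the IQIP model's algorithms, so nothing below should be read as IQIP itself; the sketch only shows the general shape of the problem it addresses, folding several information-quality signals into one score a crawler could use to prioritise pages. All signal names and weights are invented for illustration.

```python
def quality_score(page, weights=None):
    """Blend simple quality signals into one score in [0, 1].
    `page` is assumed to be a dict of pre-computed page features."""
    weights = weights or {"authority": 0.4, "freshness": 0.3, "completeness": 0.3}
    signals = {
        # link-based reputation, capped at 100 inlinks
        "authority": min(page.get("inlinks", 0) / 100.0, 1.0),
        # decays with the page's age in days
        "freshness": 1.0 / (1.0 + page.get("age_days", 365) / 365.0),
        # longer pages treated as more complete, capped at 1000 words
        "completeness": min(page.get("word_count", 0) / 1000.0, 1.0),
    }
    return sum(weights[k] * signals[k] for k in weights)
```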



2011 ◽  
pp. 167-187 ◽  
Author(s):  
Fabio Grandi ◽  
Federica Mandreoli ◽  
Riccardo Martoglia ◽  
Enrico Ronchetti ◽  
Maria Rita Scalas

While the World Wide Web user is suffering from the disease caused by information overload, for which personalization is one of the treatments that work, the citizen who gets ready to use the e-Government services made available on the Web is not immune from contagion. This seems a good reason to prescribe a personalization treatment to the e-Government user as well. Hence, we introduce the design and implementation of Web information systems supporting personalized access to multi-version resources in an e-Government scenario. Personalization is supported by means of Semantic Web techniques and relies on an ontology-based profiling of users (citizens). The resources we consider are collections of norm documents (laws, decrees, regulations, etc.) in XML format, but they can also be generic Web pages and portals or e-Government transactional services. We introduce a reference infrastructure, describe the organization of a prototype system we have developed, and present its performance figures.
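
At its simplest, personalized access to multi-version resources means filtering each norm's versions by the reader's profile. The sketch below reduces the chapter's ontology-based machinery to flat dictionaries purely for illustration; the actual system works on XML norm documents and richer citizen ontologies.

```python
def applicable_versions(versions, citizen_profile):
    """Keep only the versions whose applicability annotations are
    satisfied by the citizen's profile (empty annotations match everyone)."""
    return [v for v in versions
            if all(citizen_profile.get(k) == val
                   for k, val in v.get("applies_to", {}).items())]

# Example: a citizen profiled as a pensioner sees the general wording
# of an article plus the variant that applies to pensioners.
versions = [
    {"text": "Art. 1 (general wording)", "applies_to": {}},
    {"text": "Art. 1 (pensioner variant)", "applies_to": {"category": "pensioner"}},
    {"text": "Art. 1 (student variant)", "applies_to": {"category": "student"}},
]
print(applicable_versions(versions, {"category": "pensioner"}))
```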



Author(s):  
Sudha Ram

We are fortunate to be experiencing explosive growth and advancement in the Internet and the World Wide Web (WWW). In 1999, the global online population was estimated to be 250 million WWW users worldwide, while the number of pages on the Web was estimated at 800 million (http://www.internetindicators.com/facts.html). The bright side of this kind of growth is that information is available to almost anyone with access to a computer and a phone line. However, the dark side of this explosion is that we are now squarely in the midst of the "Age of Information Overload"! The staggering amount of information has made it extremely difficult for users to locate and retrieve information that is actually relevant to their task at hand. Given the bewildering array of resources being generated and posted on the WWW, the task of finding exactly what a user wants is rather daunting. Although many search engines currently exist to assist in information retrieval, much of the burden of searching is on the end user. A typical search results in millions of hits, many of which are outdated, irrelevant, or duplicated. One promising approach to managing the information overload problem is to use "intelligent agents" for search and retrieval. This editorial explores the current status of intelligent agents and points out some challenges in the development of intelligent agent-based systems.


