A Study on Tools and Techniques of Big Data Analytics for Text Summarization From Multi-Documents

Author(s):  
Martin Aruldoss ◽  
Miranda Lakshmi Travis

Multi-document summarization extracts and condenses information from several source documents without altering its original context. It is carried out using either extractive or abstractive text summarization. Extractive summarization builds a summary from sentences taken verbatim from the sources, whereas abstractive summarization generates new sentences that convey the content of the source documents. Abstractive summarization is the more advanced of the two approaches. This research studies extractive summarization of multiple documents from internet resources using word frequency counting, and with maximum coverage using K-means clustering. In an internet search, the search algorithm returns results from different websites using crawling and indexing. However, search and text summarization may draw on hundreds, thousands, or even millions of documents. To handle and manipulate such large amounts of information, big data techniques are widely applied. This research also surveys the big data techniques and tools available for multi-document summarization.
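The word-frequency approach to extractive summarization mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the stopword list, the naive sentence splitter, and the function names (`split_sentences`, `tokenize`, `summarize`) are assumptions made for illustration, and the K-means coverage step is not shown.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "it", "from", "for", "on", "with", "that", "this"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]

def summarize(documents, k=2):
    """Return the k highest-scoring sentences across all documents,
    kept verbatim and in their original order."""
    sentences = [s for doc in documents for s in split_sentences(doc)]
    # Word frequencies are counted over the whole document collection.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(list_of_document_strings, k=3)` returns three sentences copied verbatim from the inputs, which is what makes the method extractive rather than abstractive.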

Author(s):  
Manbir Sandhu ◽  
Purnima, Anuradha Saini

Big data is a fast-growing technology with the scope to mine huge amounts of data for use in various analytic applications. With large amounts of data streaming in from a myriad of sources, including social media, online transactions and ubiquitous smart devices, Big Data is garnering attention from stakeholders across academia, banking, government, health care, manufacturing and retail. Big Data refers to an enormous amount of data generated from disparate sources, along with the data analytic techniques used to examine this voluminous data for predictive trends and patterns, to exploit new growth opportunities, to gain insight, to make informed decisions and to optimize processes. Data-driven decision making is the essence of business establishments. The explosive growth of data is steering business units to tap the potential of Big Data to fuel growth and gain an edge over their competitors. The overwhelming generation of data also brings its share of concerns. This paper discusses the concept of Big Data, its characteristics, the tools and techniques deployed by organizations to harness the power of Big Data, and the daunting issues that hinder the adoption of Business Intelligence in Big Data strategies in organizations.


Author(s):  
Sheik Abdullah A. ◽  
Priyadharshini P.

The term Big Data refers to large datasets that occur in different forms. In recent years, most organizations have generated vast amounts of data in different forms, which gives rise to the dimensions of volume, variety, velocity, and veracity. The volume aspect of Big Data concerns data set maintenance: such data would normally be processed in a database, but its volume cannot be handled by a traditional database. Big Data comprises structured, unstructured, and semi-structured data. Working with Big Data draws on programming, data warehousing, computational frameworks, quantitative aptitude and statistics, and business knowledge. Within Big Data analytics, predictive analytics and social media analytics are widely used to determine patterns or trends that are about to emerge. This chapter mainly deals with the tools and techniques of big data analytics across various applications.


Big Data ◽  
2016 ◽  
pp. 1247-1259 ◽  
Author(s):  
Jayanthi Ranjan

Big data is present in every industry and is used in almost all business functions within these industries. It creates value by converting human decisions into automated algorithms using various tools and techniques. In this chapter, the authors look at big data analytics from the healthcare perspective. Healthcare involves a whole supply chain of industries, from pharmaceutical companies to clinical research centres, from hospitals to individual physicians, and anyone involved in the medical arena from the supplier to the consumer (i.e. the patient). The authors explore the growth of big data analytics in the healthcare industry, including its limitations and potential.


Author(s):  
Balamurugan Balusamy ◽  
Priya Jha ◽  
Tamizh Arasi ◽  
Malathi Velu

In recent years, big data analytics has produced very fast applications for predictive analysis of huge volumes of data in domains such as finance, health, weather, travel and marketing. Business analysts base their decisions on statistical analysis of the available data pulled in from social media, user surveys, blogs and internet resources. Customer sentiment has to be taken into account when designing, launching and pricing a product for the market, and consumers' emotions change and are influenced by several tangible and intangible factors. The ability of big data analytics to present data in a quickly viewable format, giving different perspectives on the same data, is valued in finance and health, where decision support systems can be applied to all aspects of their work. Cognitive computing and artificial intelligence are making big data analytical algorithms more autonomous, leading to big data agents with their own functionalities.


Text summarization is one of the applications of Natural Language Processing (NLP) that is bound to have a huge impact on our lives. Broadly, text summarization can be divided into two categories, extractive summarization and abstractive summarization. This work covers the implementation of a seq2seq model for summarizing textual data using TensorFlow/Keras, demonstrated on Amazon or social response reviews, issues and news articles. Text summarization is a subdomain of NLP that deals with extracting summaries from large chunks of text. There are two main types of techniques used for text summarization: NLP-based techniques and deep learning-based techniques. Accordingly, our aim is to compare the spaCy, Gensim and NLTK summarization techniques by their input requirements. The work also presents a simple NLP-based technique for text summarization that uses only Python's NLTK library.


2021 ◽  
Vol 40 ◽  
pp. 03023
Author(s):  
Saurabh Varade ◽  
Ejaaz Sayyed ◽  
Vaibhavi Nagtode ◽  
Shilpa Shinde

Text summarization is a process in which a large text file is converted into a summarized version that preserves the original meaning and context. The main aim of any text summarization is to provide an accurate and precise summary. One approach is to use a sentence ranking algorithm, which falls under extractive summarization. Here, a graph-based ranking algorithm is used to rank the sentences in the text, and the top k scored sentences are included in the summary. Graph-based ranking algorithms are the most widely used way to decide the importance of a vertex in a graph based on information drawn from the graph as a whole. TextRank is one of the most efficient of these; it derives from the PageRank approach used in Web link analysis to measure the importance of web pages. Another approach is abstractive summarization, where an LSTM encoder-decoder model is used along with an attention mechanism that focuses on important words from the input. The encoder encodes the input sequence, and the decoder, together with the attention mechanism, produces the summary as output.
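As a rough illustration of graph-based sentence ranking, the sketch below builds a sentence graph and runs a PageRank-style iteration over it. It is a simplified sketch under stated assumptions, not the authors' code: the overlap-based `similarity` function, the damping factor of 0.85, the fixed iteration count, and all function names are choices made here for illustration.

```python
import math
import re

def tokens(sentence):
    """Set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def similarity(a, b):
    """Word overlap normalised by the log of both sentence lengths."""
    ta, tb = tokens(a), tokens(b)
    if len(ta) < 2 or len(tb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(ta & tb) / (math.log(len(ta)) + math.log(len(tb)))

def textrank(sentences, damping=0.85, iterations=50):
    """Score each sentence (a graph vertex) by weighted PageRank-style iteration."""
    n = len(sentences)
    # Edge weights: pairwise sentence similarity, no self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per vertex
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores

def top_k(sentences, k=2):
    """Return the k best-ranked sentences in their original order."""
    scores = textrank(sentences)
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

A sentence that shares no words with any other receives only the baseline score of `1 - damping`, so it is the first to be dropped from the summary.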


Author(s):  
Adarsh Bhandari

Abstract: With the rapid growth of data-driven solutions, companies are integrating large volumes of data from multiple sources in order to gain useful results. Handling this volume of data requires a cloud-based architecture to store and manage it. Cloud computing has emerged as a significant infrastructure that promises to reduce the need for organizations to maintain costly computing facilities and to help them scale up their products. Today, heavy applications are deployed and managed on the cloud, especially on AWS, eliminating the need for error-prone manual operations. This paper describes the cloud computing tools and techniques available for handling big data, the processes involved from data extraction through to model deployment, and the distinctions among their uses. It also shows how big data analytics and cloud computing will change the methods that drive the industry. Additionally, a study is presented later in the paper on managing blockchain-generated big data on the cloud and making analytical decisions. Furthermore, the impact of blockchain on cloud computing and big data analytics is examined. Keywords: Cloud Computing, Big Data, Amazon Web Services (AWS), Google Cloud Platform (GCP), SaaS, PaaS, IaaS.


2017 ◽  
Vol 4 (4) ◽  
pp. 21-47 ◽  
Author(s):  
Surabhi Verma

The insights that firms gain from big data analytics (BDA) in real time are used to direct, automate and optimize decision making so that they can achieve their organizational goals. Data management (DM) and advanced analytics (AA) tools and techniques are among the key contributors to making BDA possible. This paper investigates the characteristics of big data (BD), the processes of data management, AA techniques, applications across sectors, and issues related to their effective implementation and management within the broader context of BDA. A range of recently published literature on the characteristics of BD, DM processes and AA techniques is reviewed to explore their current state, applications, and the issues and challenges learned from practice. The findings discuss the different characteristics of BD and a framework for BDA using data management processes and AA techniques. They also discuss the opportunities, applications and challenges that managers dealing with these technologies face in gaining competitive advantage. The study findings are intended to assist academicians and managers in effectively turning the data available in an organization into BD by understanding its properties and the emerging technologies, applications and issues behind BDA implementation.


Author(s):  
Vinay Kellengere Shankarnarayan

In recent years, big data has gained massive popularity among researchers, decision analysts, and data architects in enterprises. Big data was once just another way of saying analytics. In today's world, a company's capital lies in its big data. Consider the world's largest companies: the value they offer comes from their data, which they analyze proactively for their own benefit. This chapter presents an overview of big data and the tools and techniques that companies have adopted to deal with data problems. The authors also focus on frameworks and methodologies for handling massive data in order to make more accurate and precise decisions. The chapter begins with the current organizational scenario and what is meant by big data. Next, it sets out the various challenges faced by organizations. The authors also examine big data business models and the different frameworks available and how they have been categorized. Finally, the conclusion discusses the challenges and the future perspective of this research area.

