Text and Sentiment Analysis on Big Data

Author(s):  
Saifuzzafar Jaweed Ahmed

Big Data has become an essential part of nearly every industry and organization. Sectors such as energy, banking, retail, hardware, and networking all generate huge amounts of unstructured data, which must be processed and accurately analyzed into a structured form. The structured data can then reveal information that is valuable for business growth. Big Data helps extract useful information from unstructured or heterogeneous data by analyzing it. Big data was initially defined by the volume of a data set. Big data sets are generally huge, measuring tens of terabytes and sometimes crossing into petabytes. Today, big data falls into three categories: structured, unstructured, and semi-structured. The size of big data is growing at a fast pace, from terabytes to exabytes. Big data therefore requires techniques that can integrate and process huge amounts of heterogeneous data. Data analysis, a core big data process, has applications in areas such as business processing, disease prevention, and cybersecurity. Big data faces three major issues: data storage, data management, and information retrieval. Big data processing requires a particular setup of hardware and virtual machines to derive results, and the processing is performed in parallel to obtain results as quickly as possible. Current big data processing techniques include text mining and sentiment analysis. Text analytics is a very broad field comprising several techniques, models, and methods for the automatic and quantitative analysis of textual data. The purpose of this paper is to show how text analysis and sentiment analysis process unstructured data, and how these techniques extract meaningful information and thus make it available to various data mining (statistical and machine learning) algorithms.
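To make the pipeline concrete, here is a minimal sketch of the kind of text-plus-sentiment analysis step the abstract describes, turning unstructured documents into structured records for downstream mining. The paper does not name a library; NLTK's VADER analyzer is used here purely as one common choice.

```python
# Minimal text + sentiment analysis sketch (illustrative; the paper does not
# specify a library -- NLTK's VADER is used here as one common choice).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download

def score_documents(docs):
    """Turn raw, unstructured text into structured sentiment records."""
    sia = SentimentIntensityAnalyzer()
    records = []
    for doc in docs:
        scores = sia.polarity_scores(doc)  # neg/neu/pos/compound scores
        records.append({
            "text": doc,
            "compound": scores["compound"],  # overall polarity in [-1, 1]
            "label": "positive" if scores["compound"] >= 0.05
                     else "negative" if scores["compound"] <= -0.05
                     else "neutral",
        })
    return records  # structured rows, ready for statistical/ML algorithms

if __name__ == "__main__":
    for row in score_documents(["The service was excellent!",
                                "Delivery was late and the item was damaged."]):
        print(row["label"], row["compound"], "-", row["text"])
```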

Author(s):  
Jaimin N. Undavia ◽  
Atul Patel ◽  
Sheenal Patel

The availability of huge amounts of data has opened up a new area and a new challenge: analyzing these data. Such analysis has become essential for every organization and may yield useful information about its future prospects. Traditional database systems are neither adequate nor capable of storing, managing, and analyzing such huge amounts of data, so a new term was introduced: "Big Data". The term refers to huge volumes of data used for analysis and for future prediction or forecasting. Big Data may consist of a combination of structured, semi-structured, and unstructured data, and managing such data is a major current challenge. This heterogeneous data must be maintained in a very secure and specific way. In this chapter, we identify these challenges and issues and attempt to resolve them with specific tools.


Author(s):  
Anwar H. Katrawi ◽  
Rosni Abdullah ◽  
Mohammed Anbar ◽  
Ibrahim AlShourbaji ◽  
Ammar Kamal Abasi

The proliferation of information technology produces huge amounts of data, called big data, that cannot be processed by traditional database systems. These various types of data come from different sources. Stragglers are a major bottleneck in big data processing, so early detection and accurate identification of stragglers can have an important impact on performance. This work assesses five straggler identification methods: the Hadoop native scheduler, the LATE scheduler, Mantri, MonTool, and Dolly. The performance of these techniques was evaluated on three benchmark workloads: Sort, Grep, and WordCount. The results show that the LATE scheduler performs best and is an efficient choice for straggler identification.
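For context on the winning method: LATE (Longest Approximate Time to End) estimates each task's remaining time from its observed progress rate and speculatively re-executes the slow tasks expected to finish last. The sketch below illustrates that core heuristic; the task fields and the 25% slow-task threshold are illustrative assumptions, not values from this paper.

```python
# Sketch of the LATE speculative-execution heuristic (illustrative only;
# thresholds and task fields are assumptions, not values from the paper).
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    progress: float   # fraction complete, 0.0 .. 1.0
    elapsed: float    # seconds since the task started

SLOW_TASK_THRESHOLD = 0.25   # only consider the slowest 25% of tasks

def time_to_end(task: Task) -> float:
    """Estimate remaining time as (1 - progress) / progress_rate."""
    rate = task.progress / task.elapsed if task.elapsed > 0 else 0.0
    return float("inf") if rate == 0 else (1.0 - task.progress) / rate

def pick_speculative(tasks: list[Task]) -> list[Task]:
    """Pick slow, unfinished tasks with the longest approximate time to end."""
    rates = sorted(t.progress / t.elapsed for t in tasks if t.elapsed > 0)
    cutoff = rates[int(len(rates) * SLOW_TASK_THRESHOLD)]
    slow = [t for t in tasks if t.elapsed > 0
            and t.progress / t.elapsed <= cutoff and t.progress < 1.0]
    return sorted(slow, key=time_to_end, reverse=True)

tasks = [Task("t1", 0.9, 60), Task("t2", 0.2, 55), Task("t3", 0.85, 58)]
for t in pick_speculative(tasks):
    print(t.task_id, round(time_to_end(t), 1), "s left -> speculate")
```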


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Geetanjali Rathee ◽  
Adel Khelifi ◽  
Razi Iqbal

Automated techniques enabled by Artificial Neural Networks (ANN), the Internet of Things (IoT), and cloud-based services affect the real-time analysis and processing of information in a variety of applications. In addition, multihoming is a type of network that combines various networks into a single environment while managing a huge amount of data. Big data processing and monitoring in multihoming networks currently receive little attention with respect to reducing security risk and improving efficiency while information is processed or monitored. The use of AI-based systems in multihoming big data, with IoT- and AI-integrated systems, may bring benefits in various respects. Although multihoming security issues and their analysis have been well studied by various scientists and researchers, little attention has been paid to secure big data processing in multihoming, especially using automated techniques and systems. The aim of this paper is to propose an IoT-based artificial neural network that processes big data over a secure multihoming communication network using the Bayesian Rule (BR) and Levenberg-Marquardt (LM) algorithms. Further, the efficiency and effect of the AI-assisted mechanism on multihoming information processing are evaluated over various parameters such as classification accuracy, classification time, specificity, sensitivity, ROC, and F-measure.
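For reference, most of the evaluation parameters listed above are standard confusion-matrix statistics for a binary classifier. A minimal sketch of how they are typically computed follows; these are the textbook definitions, not the paper's own code.

```python
# Standard confusion-matrix metrics used in the abstract's evaluation
# (generic definitions; not the paper's own implementation).
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)            # recall / true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision   = tp / (tp + fp)
    f_measure   = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f_measure": f_measure}

print(classification_metrics(tp=90, tn=85, fp=15, fn=10))
```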


The real challenge for data miners lies in extracting useful information from huge datasets, and choosing an efficient algorithm to analyze and process such unstructured data is itself a challenge. Cluster analysis is an unsupervised technique for gaining insight into data in the era of Big Data. Hyperflated PIC (HPIC) is a Big Data processing solution built around clustering: a scalable, efficient algorithm that addresses the shortcomings of existing clustering algorithms and can process huge datasets quickly. The HPIC algorithms were validated by experiments on synthetic and real datasets using different evaluation measures. The quality of the clustering results was also analyzed, and the approach proved highly efficient and suitable for Big Data processing.
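Assuming PIC here refers to Power Iteration Clustering, the base algorithm embeds points by truncated power iteration on a row-normalized affinity matrix and then clusters the resulting one-dimensional embedding. A minimal NumPy sketch of plain PIC follows; the "hyperflated" extensions of HPIC are not described in the abstract and are not reproduced here.

```python
# Minimal Power Iteration Clustering (PIC) sketch in NumPy -- the base
# algorithm HPIC builds on; the paper's "hyperflated" modifications are
# not reproduced here.
import numpy as np

def pic_embedding(affinity: np.ndarray, iters: int = 50) -> np.ndarray:
    """Truncated power iteration on the row-normalized affinity matrix."""
    W = affinity / affinity.sum(axis=1, keepdims=True)  # row-stochastic
    v = np.random.default_rng(0).random(W.shape[0])
    v /= np.abs(v).sum()
    for _ in range(iters):
        v = W @ v
        v /= np.abs(v).sum()      # keep the vector from shrinking to zero
    return v  # 1-D embedding: nearby values suggest the same cluster

def pic_cluster(affinity: np.ndarray, k: int) -> np.ndarray:
    """Cluster the 1-D PIC embedding by splitting at the k-1 largest gaps
    (a simple stand-in for the k-means step used in the original PIC)."""
    v = pic_embedding(affinity)
    if k < 2:
        return np.zeros(len(v), dtype=int)
    order = np.argsort(v)
    gaps = np.diff(v[order])                     # gaps between sorted values
    splits = np.sort(np.argsort(gaps)[-(k - 1):])
    labels = np.zeros(len(v), dtype=int)
    for i, segment in enumerate(np.split(order, splits + 1)):
        labels[segment] = i
    return labels
```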


2019 ◽  
Vol 12 (1) ◽  
pp. 42 ◽  
Author(s):  
Andrey I. Vlasov ◽  
Konstantin A. Muraviev ◽  
Alexandra A. Prudius ◽  
Demid A. Uzenkov
