Movie Analytics for Effective Recommendation System using Pig with Hadoop

2016 ◽  
Vol 3 (2) ◽  
pp. 82-100 ◽  
Author(s):  
Arushi Jain ◽  
Vishal Bhatnagar

Movies have been a great source of entertainment ever since their inception in the late 19th century. The term movie is very broad, and its definition spans languages and genres such as drama, comedy, science fiction, and action. The volume of movie data accumulated over the years is vast, and analyzing it requires moving beyond traditional analytics techniques to big data analytics. In this paper, the authors take a movie data set and analyze it with a range of queries to uncover useful insights for an effective recommendation system and for rating upcoming movies.

Author(s):  
Yihao Tian

Big data is an unstructured data set with a considerable volume, coming from various sources such as the internet and business organizations, in various formats. Predicting consumer behavior is a core responsibility for most dealers. Market research can show consumer intentions, but it can be a tall order for even the best-designed research project to penetrate the veil that protects real customer motivations from closer scrutiny. Customer behavior analysis usually focuses on customer data mining, and each model is structured at one stage to answer one query. Customer behavior prediction is a complex and unpredictable challenge. In this paper, advanced mathematical and big data analytical (BDA) methods are used to predict customer behavior. Predictive behavior analytics can provide modern marketers with multiple insights to optimize their strategies. This model goes beyond analyzing historical evidence and makes the most knowledgeable assumptions about what will happen in the future using mathematical methods. Although the method is complex, it is quite straightforward for most customers. As a result, most consumer behavior models include many variables and, using big data, produce predictions that are usually quite accurate. This paper attempts to develop an association rule mining model to predict customers' behavior, improve accuracy, and derive major consumer data patterns. The findings indicate that the recommended BDA method improves big data analytics usability in the organization (98.2%), risk management ratio (96.2%), operational cost (97.1%), customer feedback ratio (98.5%), and demand prediction ratio (95.2%).
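As a rough illustration of the association rule mining mentioned in this abstract, the following minimal sketch computes support and confidence for single-item rules. It is not the authors' model: the transactions, the thresholds, and the function names `support` and `pair_rules` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (not from the paper): one set per customer basket.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for one-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to yield a useful rule
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


if __name__ == "__main__":
    for lhs, rhs, sup, conf in pair_rules(TRANSACTIONS):
        print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

A production approach would typically use a frequent-itemset algorithm such as Apriori or FP-Growth rather than enumerating pairs directly; the brute-force loop here only serves to show what support and confidence measure.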


Author(s):  
Dennis T. Kennedy ◽  
Dennis M. Crossen ◽  
Kathryn A. Szabat

Big Data Analytics has changed the way organizations make decisions, manage business processes, and create new products and services. Business analytics is the use of data, information technology, statistical analysis, and quantitative methods and models to support organizational decision making and problem solving. The main categories of business analytics are descriptive analytics, predictive analytics, and prescriptive analytics. Big Data is data that exceeds the processing capacity of conventional database systems and is typically defined by three dimensions known as the Three V's: Volume, Variety, and Velocity. Big Data brings big challenges. Big Data has influenced not only the analytics that are utilized but also the technologies and the people who use them. While Big Data brings challenges, it also presents opportunities. Those who embrace Big Data and effective Big Data Analytics as a business imperative can gain competitive advantage.


2019 ◽  
Vol 8 (3) ◽  
pp. 1572-1580

Tourism is one of the most important sectors contributing to the economic growth of India. Big data analytics has recently been applied in the tourism sector for activities such as tourism demand forecasting, prediction of tourists' interests, and identification of tourist attraction elements and behavioural patterns. The major objective of this study is to demonstrate how big data analytics could be applied in predicting the travel behaviour of international and domestic tourists. Machine learning algorithms and techniques also play a significant role in processing big data, and the combination of machine learning and big data is a state-of-the-art method that has been acclaimed internationally. While big data analytics and its application to the tourism industry have attracted the interest of a few researchers in recent times, there has not been much research in this area, particularly with respect to India. This study intends to describe how big data analytics could be used in forecasting Indian tourists' travel behaviour. To add value to the research, this study intends to categorize the grounds on which tourists chose domestic tourism and the grounds on which they chose international tourism. Online data sets of place reviews from the cities of Chicago, Beijing, New York, Dubai, San Francisco, London, New Delhi, and Shanghai have been gathered, and an association rule mining based algorithm has been applied to the data set in order to attain the objectives of the study.


2019 ◽  
Vol 8 (S3) ◽  
pp. 35-40
Author(s):  
S. Mamatha ◽  
T. Sudha

In this digital world, organizations are evolving rapidly around data-centric assets, and both the volume of data and the size of databases are growing exponentially. Data is generated from different sources such as business processes, transactions, social networking sites, and web servers, and exists in structured as well as unstructured form. The term "big data" is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. Big data sizes range from a few dozen terabytes to many petabytes in a single data set. Difficulties include capture, storage, search, sharing, analytics, and visualization. Big data is available in structured, unstructured, and semi-structured formats. Relational databases fail to store this multi-structured data. Apache Hadoop is an efficient, robust, reliable, and scalable framework to store, process, transform, and extract big data. The Hadoop framework is open-source, free software available from the Apache Software Foundation. In this paper we present Hadoop, HDFS, MapReduce, and a c-means big data algorithm to minimize the effort of big data analysis using MapReduce code. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools and related fields.
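The MapReduce model named in this abstract can be sketched in a few lines. The example below is an in-process simulation of the map, shuffle, and reduce phases using a word count; it does not use Hadoop or HDFS, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum all counts emitted for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(["big data", "Big Hadoop data"]))
```

In a real cluster the mappers and reducers run in parallel on different nodes and the shuffle moves data across the network; the sequential version here only shows how the three phases fit together.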


2020 ◽  
Vol 8 (6) ◽  
pp. 3704-3708

Big data analytics is a field concerned with analysing and processing information from data sets that are too large or complex to be handled by traditional data-processing methods. It is used to analyse data and helps in predicting the best outcome from the data sets. Big data analytics can be very useful in predicting crime and can also suggest the best possible approach to solving it. In this system we use a past crime data set to find patterns, and through those patterns we predict the range of an incident. The range of the incident is determined by the decision model, and the prediction is made according to that range. The data sets are non-linear and take the form of time series, so the system uses the Prophet model, which is designed to analyse non-linear time series data. The Prophet model has three main components: trend, seasonality, and holidays. The system will help the crime cell predict possible incidents according to the patterns found by the algorithm, and it will also help deploy the right number of resources to highly marked areas where incidents are most likely to occur. The system will enhance crime prediction and help the crime department use its resources more efficiently.
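The three components named in this abstract combine additively in the Prophet model: a forecast is the sum of a trend term, a seasonal term, and a holiday term. The sketch below illustrates only that additive structure. It does not use the Prophet library or fit anything to data, and every function name and parameter value in it is hypothetical.

```python
import math


def trend(t, slope=0.5, intercept=10.0):
    """Linear trend component g(t); slope and intercept are made-up values."""
    return intercept + slope * t


def weekly_seasonality(t, amplitude=2.0):
    """Periodic component s(t) with a 7-day period."""
    return amplitude * math.sin(2 * math.pi * t / 7)


def holiday_effect(t, holidays=frozenset({5}), bump=3.0):
    """Holiday component h(t): a fixed bump on the listed (hypothetical) days."""
    return bump if t in holidays else 0.0


def forecast(t):
    """Additive combination of the three components for day index t."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t)


if __name__ == "__main__":
    for day in range(8):
        print(day, round(forecast(day), 2))
```

The actual model estimates each component from historical data and adds an error term; here the components are fixed functions so that the contribution of each one is easy to see.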


Author(s):  
Anurag Sinha ◽  
Arinjay Biswas ◽  
Tushar Raj ◽  
Aditya Misra

2019 ◽  
Vol 8 (4) ◽  
pp. 7356-7360

Data analytics is a scientific as well as an engineering tool used to investigate raw data and transform it into information from which knowledge can be gained. It is normally associated with obtaining knowledge from reliable information sources, rapid information processing, and prediction of future outcomes. Big data analytics is evolving strongly, characterized by volume, velocity, and variety. Most organizations are now concentrating on analyzing information or raw data and are interested in deploying analytics to handle forthcoming issues and challenges. This research proposes a prediction model, or intelligent model, that applies machine learning algorithms to the data set. The results are then interpreted and analyzed to obtain a better forecast value. The major objective of this research work is to find the optimum prediction from a medical data set using machine learning techniques.


Author(s):  
Naji Albakay ◽  
Michael Hempel ◽  
Hamid Sharif

Rolling stock, particularly of freight railroads, is currently maintained using regular preventative and corrective maintenance schedules. This maintenance approach recommends sets of inspections and maintenance procedures based on the average expected wear and tear across their inventory. In practice, however, this approach to scheduling preventative maintenance is not always effective. When scheduled too soon, it results in a loss of operating revenue, whereas when it is scheduled too late, equipment failure could lead to costly and disastrous derailments. Instead, proactive maintenance scheduling based on Big Data Analytics (BDA) could be utilized to replace traditional scheduling, resulting in optimized maintenance cycles for higher train safety, availability, and reliability. BDA could also be used to discover patterns and relationships that lead to train failures, identify manufacturer reliability concerns, and help validate the effectiveness of operational improvements. In this work, we introduce a train inventory simulation platform that enables the modelling of different train components such as wheels, brakes, axles, and bearings. The simulator accounts for the wear and tear in each component and generates a comprehensive data set suitable for BDA that can be used to evaluate the effectiveness of different BDA approaches in discerning patterns and extracting knowledge from the data. It provides the basis for showing that BDA algorithms such as Random Forest [9] and Linear Regression can be utilized to create models for proactive train maintenance scheduling. We also show the capability of BDA to detect hidden patterns and to predict failure of train components with high accuracy.

