Hot News Recommendation System across Heterogeneous Websites Using Hadoop

2014 ◽  
Vol 989-994 ◽  
pp. 4704-4707
Author(s):  
Sheng Wu Xu ◽  
Zheng You Xia

Most current news recommendation systems are suitable only for news that comes from a single news website, not for news drawn from many different news websites. Little research has been reported on utilizing hundreds of news websites to provide top hot news services for group customers (e.g., government staff). In this paper, we present a hot news recommendation system based on Hadoop that aggregates news from hundreds of different news websites. We discuss our news recommendation system architecture based on Hadoop and conclude that Hadoop is an excellent tool for web big data analytics, scaling well with increasing data set size and number of nodes in the cluster. Experimental results demonstrate the reliability and effectiveness of our method.
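The abstract does not detail the Hadoop jobs themselves; a minimal pure-Python sketch of the map/reduce idea, assuming hotness is approximated by how many distinct websites report a given title (the records, site names, and titles below are illustrative, not from the paper), might look like:

```python
from collections import defaultdict

# Toy records: (website, news_title) pairs crawled from many sites.
records = [
    ("site-a.com", "Flood warnings issued"),
    ("site-b.com", "Flood warnings issued"),
    ("site-c.com", "Local team wins cup"),
    ("site-d.com", "Flood warnings issued"),
]

def map_phase(records):
    """Map: emit (title, website) pairs keyed by title."""
    for website, title in records:
        yield title, website

def reduce_phase(pairs):
    """Reduce: count the distinct reporting websites per title."""
    sites = defaultdict(set)
    for title, website in pairs:
        sites[title].add(website)
    return {title: len(s) for title, s in sites.items()}

counts = reduce_phase(map_phase(records))
hot = max(counts, key=counts.get)  # title reported by the most websites
```

In an actual Hadoop deployment the map and reduce functions would run as distributed tasks (e.g. via Hadoop Streaming), with the shuffle phase replacing the in-memory dictionary.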

2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Zhengyou Xia ◽  
Shengwu Xu ◽  
Ningzhong Liu ◽  
Zhengkang Zhao

Most current news recommendation systems are suitable only for news that comes from a single news website, not for news drawn from different heterogeneous news websites. Previous research on news recommender systems has proposed different strategies for providing personalized news services to online readers. However, little work has been reported on utilizing hundreds of heterogeneous news websites to provide top hot news services for group customers (e.g., government staff). In this paper, we propose a hot news recommendation model, based on a Bayesian model, that draws on hundreds of different news websites. In the model, we determine whether a news item is hot news by calculating its joint probability. We evaluate and compare our proposed recommendation model with the results of human experts on real data sets. Experimental results demonstrate the reliability and effectiveness of our method. We also deployed this model in the hot news recommendation system of the Hangzhou city government in 2013, where it achieved very good results.
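The paper's exact Bayesian formulation is not given in the abstract; a minimal naive-Bayes-style sketch, in which the joint probability of a news item's observed features is compared under the "hot" and "not hot" hypotheses (all likelihoods, feature names, and the prior below are assumed, illustrative values), could look like:

```python
import math

# Assumed feature likelihoods (illustrative, not from the paper).
p_feature_given_hot = {"many_sites": 0.8, "many_comments": 0.7, "front_page": 0.6}
p_feature_given_cold = {"many_sites": 0.2, "many_comments": 0.3, "front_page": 0.4}
p_hot = 0.1  # assumed prior that an arbitrary news item is hot

def joint_log_prob(features, likelihoods, prior):
    """Joint log-probability of the observed features under one hypothesis,
    assuming conditional independence (naive Bayes)."""
    logp = math.log(prior)
    for f in features:
        logp += math.log(likelihoods[f])
    return logp

def is_hot(features):
    """Label an item hot when the 'hot' hypothesis has higher joint probability."""
    return (joint_log_prob(features, p_feature_given_hot, p_hot)
            > joint_log_prob(features, p_feature_given_cold, 1 - p_hot))
```

An item observed on many sites with many comments and front-page placement is classified hot; one with only front-page placement is not, given these assumed numbers.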


Author(s):  
Ponugupati Narendra Mohan et al.

The recent global crises in the environment (emission of pollutants) and in health (the COVID-19 pandemic) have created a recession in all sectors. Innovations in technology have led to heavy competition in the global market, forcing manufacturers to develop new variants, especially in the automobile sector. This creates turbulence in demand for the production of new models and the maintenance of existing models that became obsolete with the implementation of the BS-VI standard of India's automobile regulatory authority. This research work develops a novel value analysis model that integrates a multi-objective function with multi-criteria decision-making analysis, incorporating big data analytics with green supply chain management, to bridge the demand gap in an Indian manufacturing sector. Using a firm-level data set and a matrix chain multiplication dynamic programming algorithm, the computational results illustrate that the proposed algorithm is effective.
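The abstract names a matrix chain multiplication dynamic programming algorithm; the standard textbook form of that algorithm (a sketch of the generic technique, not the authors' integrated value-analysis model) computes the minimum number of scalar multiplications needed to evaluate a chain of matrix products:

```python
def matrix_chain_cost(p):
    """Minimum scalar multiplications to multiply matrices A_1..A_n,
    where A_i has dimensions p[i-1] x p[i]."""
    n = len(p) - 1  # number of matrices in the chain
    m = [[0] * n for _ in range(n)]  # m[i][j]: min cost of A_{i+1}..A_{j+1}
    for length in range(2, n + 1):          # chain length
        for i in range(n - length + 1):
            j = i + length - 1
            # Try every split point k and keep the cheapest parenthesization.
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + p[i] * p[k + 1] * p[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]
```

For dimensions [10, 20, 30, 40], multiplying (A·B) first and then by C costs 10·20·30 + 10·30·40 = 18000 scalar multiplications, which the table confirms is optimal.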


Author(s):  
Irfan Ali Kandhro ◽  
Sahar Zafar Jumani ◽  
Kamlash Kumar ◽  
Abdul Hafeez ◽  
Fayyaz Ali

This paper presents an automated tool for the classification of text with respect to predefined categories. Text classification has always been considered a vital method for managing and processing the huge number of digital documents that are widespread and continuously increasing. Most research in text classification has been done on Urdu, English, and other languages, but limited work has been carried out on Roman Urdu data. Technically, the text classification process follows two steps: the first step chooses the main features from all available features of the text documents using feature extraction techniques; the second step applies classification algorithms to those chosen features. The data set was collected with scraping tools from the popular news websites Awaji Awaze and Daily Jhoongar, and split into training and testing sets of 70% and 30%, respectively. In this paper, deep learning models such as RNN, LSTM, and CNN are used for the classification of Roman Urdu headline news. The testing accuracies are 81% for RNN, 82% for LSTM, and 79% for CNN; the experimental results demonstrate that the LSTM method outperforms both CNN and RNN.
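The 70%/30% split step can be sketched in plain Python (a generic illustration; the paper's actual preprocessing and the RNN/LSTM/CNN models themselves are not reproduced here, and the placeholder headlines are invented):

```python
import random

def train_test_split(data, train_frac=0.7, seed=42):
    """Shuffle a data set deterministically and split it into
    training and testing portions."""
    items = list(data)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

# Placeholder (headline, class) pairs standing in for scraped Roman Urdu news.
headlines = [("headline %d" % i, i % 3) for i in range(100)]
train, test = train_test_split(headlines)
```

A fixed seed makes the split reproducible, which matters when comparing models such as RNN, LSTM, and CNN on the same partition.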


2019 ◽  
Vol 8 (4) ◽  
pp. 7356-7360

Data analytics is a scientific as well as an engineering tool used to investigate raw data and refine it into information and, ultimately, knowledge. It is normally associated with obtaining knowledge from reliable information sources, rapid information processing, and prediction from the analyzed data. Big data analytics is evolving rapidly, characterized by the features of volume, velocity, and variety. Many organizations now concentrate on analyzing raw data and are keen to deploy analytics to cope with forthcoming issues and challenges. A prediction model, or intelligent model, is proposed in this research that applies machine learning algorithms to the data set; the results are then interpreted to assess the quality of the forecast. The major objective of this research work is to find the optimum prediction from a medical data set using machine learning techniques.


2021 ◽  
Author(s):  
Maryam Bagheri ◽  
Shahram Jamali ◽  
Reza Fotohi

Nowadays, with the development of technology and ubiquitous Internet access, interest in getting news from newspapers and other traditional media is decreasing. The popularity of news websites is therefore rising as newspapers change into electronic versions. News websites can be accessed from anywhere, i.e., any country, city, or region, so presenting news according to where the reader is from is a worthwhile research area: faced with a variety of news topics, readers prefer websites whose home pages more often show the news they are interested in. Based on this idea, we present a technique to find the favorite topics of Twitter users in certain geographical districts, giving news websites a way of increasing their popularity. In this work we processed tweets. Although individual tweets are small, we found that processing them takes a lot of time, owing to the many repetitions of the algorithm and the many searches to be done; we therefore treated the task as a big data problem and developed our work in the Spark framework. Our technique includes two phases: a Feature Extraction Phase and a Topic Discovery Phase. Our analysis shows that the technique achieves accuracy between 68% and 76% across three configurations: 3-fold, 5-fold, and 10-fold.
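The 3-fold, 5-fold, and 10-fold evaluations mentioned above follow standard k-fold cross-validation; a minimal sketch, with a pluggable `predict` function standing in for the authors' topic model (the majority-label predictor used below is only a stand-in), is:

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation
    over n samples; fold sizes differ by at most one."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def cross_val_accuracy(predict, data, labels, k):
    """Mean accuracy over k folds; predict(train_data, train_labels, x)
    returns a label for sample x."""
    accs = []
    for train, test in k_fold_indices(len(data), k):
        correct = sum(
            predict([data[i] for i in train],
                    [labels[i] for i in train],
                    data[j]) == labels[j]
            for j in test
        )
        accs.append(correct / len(test))
    return sum(accs) / k
```

In a Spark setting the per-fold evaluations are independent and can be distributed across the cluster.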


2016 ◽  
Vol 3 (2) ◽  
pp. 82-100 ◽  
Author(s):  
Arushi Jain ◽  
Vishal Bhatnagar

Movies have been a great source of entertainment ever since their inception in the late 19th century. The term movie is very broad, covering languages and genres such as drama, comedy, science fiction, and action. The data about movies accumulated over the years is vast, and analyzing it requires breaking away from traditional analytics techniques and adopting big data analytics. In this paper the authors take a data set on movies and analyze it against various queries to uncover real nuggets from the data set, supporting an effective recommendation system and ratings for upcoming movies.
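The kind of aggregation query run against such a movie data set can be sketched as follows (the records, fields, and ratings below are illustrative inventions, not the authors' data set):

```python
from collections import defaultdict

# Toy movie records with a genre and a rating per title.
movies = [
    {"title": "A", "genre": "drama",  "rating": 7.9},
    {"title": "B", "genre": "comedy", "rating": 6.5},
    {"title": "C", "genre": "drama",  "rating": 8.4},
    {"title": "D", "genre": "action", "rating": 7.1},
]

def avg_rating_by_genre(records):
    """Group records by genre and return the mean rating of each group."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum, count]
    for r in records:
        totals[r["genre"]][0] += r["rating"]
        totals[r["genre"]][1] += 1
    return {g: s / n for g, (s, n) in totals.items()}

by_genre = avg_rating_by_genre(movies)
```

At scale the same group-and-aggregate shape maps naturally onto big data frameworks.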


2020 ◽  
Vol 13 (3) ◽  
pp. 381-393
Author(s):  
Farhana Fayaz ◽  
Gobind Lal Pahuja

Background: The Static VAR Compensator (SVC) can improve the reliability, operation, and control of the transmission system, thereby improving the dynamic performance of the power system. The SVC is a widely used shunt FACTS device and an important tool for reactive power compensation in high-voltage AC transmission systems. Transmission lines compensated with an SVC may experience faults and hence need a protection system that limits the damage caused by these faults while providing an uninterrupted supply of power.
Methods: The research work reported in this paper is a successful attempt to reduce the time to detect faults on an SVC-compensated transmission line to less than a quarter of a cycle. The relay algorithm involves two ANNs, one for detection and the other for classification of faults, including identification of the faulted phase or phases. RMS (root mean square) values of line voltages and ratios of sequence components of line currents are used as inputs to the ANNs. Extensive training and testing of the two ANNs were carried out using data generated by simulating an SVC-compensated transmission line in PSCAD at a signal sampling frequency of 1 kHz; the back-propagation method was used for training and testing. The criticality of the existing relay and the modified relay was also analyzed using three fault-tree importance measures: Fussell-Vesely (FV) Importance, Risk Achievement Worth (RAW), and Risk Reduction Worth (RRW).
Results: The relay detects any type of fault occurring anywhere on the line with 100% accuracy within 4 ms. It also classifies the type of fault and indicates the faulted phase or phases, as the case may be, with 100% accuracy within 15 ms, well before a circuit breaker can clear the fault. As demonstrated, fault detection and classification by ANNs is reliable and accurate when a large data set is available for training. The criticality analysis shows that the criticality ranking varies between the two designs (the existing relay and the modified relay), and the ranking of the improved measurement system in the modified relay changes from 2 to 4.
Conclusion: A relaying algorithm is proposed for the protection of a transmission line compensated with a Static VAR Compensator (SVC), and a criticality ranking of the different failure modes of a digital relay is carried out. The proposed scheme has significant advantages over more traditional relaying algorithms: it is suitable for high-resistance faults and is affected neither by the inception angle nor by the location of the fault.
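The RMS line-voltage inputs fed to the ANNs are computed in the standard way; a minimal sketch for one 50 Hz cycle sampled at the paper's 1 kHz rate (the 230 V level and 50 Hz frequency are illustrative assumptions) is:

```python
import math

def rms(samples):
    """Root-mean-square value of a sampled waveform,
    e.g. a line voltage sampled at 1 kHz."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# One 50 Hz cycle at 1 kHz sampling is 20 samples.
amplitude = 230.0 * math.sqrt(2)  # peak value of a 230 V RMS sine
cycle = [amplitude * math.sin(2 * math.pi * 50 * t / 1000.0)
         for t in range(20)]
```

Over a whole cycle the RMS of the sine recovers exactly 230 V, which is why a sliding full-cycle (or sub-cycle) RMS window is a natural ANN input for fast fault detection.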


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2532
Author(s):  
Encarna Quesada ◽  
Juan J. Cuadrado-Gallego ◽  
Miguel Ángel Patricio ◽  
Luis Usero

Anomaly detection research focuses on the development and application of methods that identify data that are sufficiently different, compared with the rest of the data set being analyzed, to be considered anomalies (or, as they are more commonly called, outliers). These values mainly originate from two sources: they may be errors introduced during the collection or handling of the data, or they may be correct but very different from the rest of the values. Correctly identifying each type is essential: in the first case the values must be removed from the data set, while in the second they must be carefully analyzed and taken into account. The correct selection and use of the model applied to a specific problem is fundamental to the success of an anomaly detection study, and in many cases a single model cannot provide sufficient results, which can only be reached with a mixture model built by integrating existing and/or ad hoc-developed models. That is the kind of model developed and applied to the problem presented in this paper. This study defines and applies an anomaly detection model that combines statistical models with a new method defined by the authors, the Local Transilience Outlier Identification Method, in order to improve the identification of outliers in the sensor-obtained values of variables that affect the operation of wind tunnels. Correct detection of outliers in the variables involved in wind tunnel operation is very important for the industrial ventilation systems industry, especially for vertical wind tunnels, which are used as training facilities for indoor skydiving, where incorrect performance of such devices may put human lives at risk. The presented model for outlier detection may therefore have a high impact in this industrial sector.
In this research work, a proof of concept is carried out using data from a real installation, in order to test the proposed anomaly analysis method and its application to monitoring the correct performance of wind tunnels.
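The authors' Local Transilience Outlier Identification Method is not specified in the abstract; as an illustration of the statistical component such a combined model might include, a standard z-score outlier test over sensor readings (the values and threshold below are illustrative) looks like:

```python
import math

def z_score_outliers(values, threshold=3.0):
    """Return indices of values whose z-score exceeds the threshold,
    a classic statistical outlier test."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return []  # constant signal: nothing stands out
    return [i for i, v in enumerate(values)
            if abs(v - mean) / std > threshold]
```

A reading of 50 in a stream hovering around 10 is flagged, while normal fluctuations are not; a mixture model would combine several such tests with local, neighborhood-based criteria.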


Author(s):  
Jungeui Hong ◽  
Elizabeth A. Cudney ◽  
Genichi Taguchi ◽  
Rajesh Jugulum ◽  
Kioumars Paryani ◽  
...  

The Mahalanobis-Taguchi System is a diagnostic and predictive method for analyzing patterns in multivariate cases. The goal of this study is to compare the ability of the Mahalanobis-Taguchi System and a neural network to discriminate using small data sets. We examine discriminant ability as a function of data set size using an application area where reliable data are publicly available: the Wisconsin Breast Cancer study, with nine attributes and one class.
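The Mahalanobis distance at the heart of the Mahalanobis-Taguchi System can be sketched for the two-dimensional case in plain Python (a generic illustration of the distance itself, not the full MTS procedure with its Taguchi orthogonal arrays and signal-to-noise ratios):

```python
import math

def mean_vec(X):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def cov_2x2(X, mu):
    """Sample covariance matrix of a 2-dimensional data set."""
    n = len(X)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for row in X:
        d = [row[0] - mu[0], row[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j]
    return [[c[i][j] / (n - 1) for j in range(2)] for i in range(2)]

def inv_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

def mahalanobis(x, mu, cov_inv):
    """Mahalanobis distance of x from the distribution (mu, cov)."""
    d = [x[0] - mu[0], x[1] - mu[1]]
    q = sum(d[i] * cov_inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# Reference ("normal") group; new observations are scored against it.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mu = mean_vec(X)
cov_inv = inv_2x2(cov_2x2(X, mu))
```

In MTS, distances computed against the healthy reference group define the "Mahalanobis space", and large distances flag abnormal cases.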

