scholarly journals Big Data Cleaning Model of Multi-Source Heterogeneous Power Grid Based On Machine Learning Classification Algorithm

2021 ◽  
Vol 2087 (1) ◽  
pp. 012095
Author(s):  
Zhangchi Ying ◽  
Yuteng Huang ◽  
Ke Chen ◽  
Tianqi Yu

Abstract Aiming at the low cleaning rate of the traditional multi-source heterogeneous power grid big data cleaning model, a multi-source heterogeneous power grid big data cleaning model based on machine learning classification algorithm is designed. By capturing high-quality multi-source heterogeneous power grid big data, weight labeling of data source importance measurement, data attributes and tuples, and constructing Tan network based on the idea of machine learning classification algorithm, the data probability value is finally used to complete the classification and cleaning of inaccurate data. Experiments show that the model based on machine learning classification algorithm can effectively improve the imprecise data cleaning rate compared with the traditional model to solve multi-source heterogeneous imprecise data cleaning.

2021 ◽  
Vol 11 (2) ◽  
pp. 642-650
Author(s):  
C.S. Anita ◽  
P. Nagarajan ◽  
G. Aditya Sairam ◽  
P. Ganesh ◽  
G. Deepakkumar

With the pandemic situation, there is a strong rise in the number of online jobs posted on the internet in various job portals. But some of the jobs being posted online are actually fake jobs which lead to a theft of personal information and vital information. Thus, these fake jobs can be precisely detected and classified from a pool of job posts of both fake and real jobs by using advanced deep learning as well as machine learning classification algorithms. In this paper, machine learning and deep learning algorithms are used so as to detect fake jobs and to differentiate them from real jobs. The data analysis part and data cleaning part are also proposed in this paper, so that the classification algorithm applied is highly precise and accurate. It has to be noted that the data cleaning step is a very important step in machine learning project because it actually determines the accuracy of the machine learning as well as deep learning algorithms. Hence a great importance is emphasized on data cleaning and pre-processing step in this paper. The classification and detection of fake jobs can be done with high accuracy and high precision. Hence the machine learning and deep learning algorithms have to be applied on cleaned and pre-processed data in order to achieve a better accuracy. Further, deep learning neural networks are used so as to achieve higher accuracy. Finally all these classification models are compared with each other to find the classification algorithm with highest accuracy and precision.


Author(s):  
Tyler F. Rooks ◽  
Andrea S. Dargie ◽  
Valeta Carol Chancey

Abstract A shortcoming of using environmental sensors for the surveillance of potentially concussive events is substantial uncertainty regarding whether the event was caused by head acceleration (“head impacts”) or sensor motion (with no head acceleration). The goal of the present study is to develop a machine learning model to classify environmental sensor data obtained in the field and evaluate the performance of the model against the performance of the proprietary classification algorithm used by the environmental sensor. Data were collected from Soldiers attending sparring sessions conducted under a U.S. Army Combatives School course. Data from one sparring session were used to train a decision tree classification algorithm to identify good and bad signals. Data from the remaining sparring sessions were kept as an external validation set. The performance of the proprietary algorithm used by the sensor was also compared to the trained algorithm performance. The trained decision tree was able to correctly classify 95% of events for internal cross-validation and 88% of events for the external validation set. Comparatively, the proprietary algorithm was only able to correctly classify 61% of the events. In general, the trained algorithm was better able to predict when a signal was good or bad compared to the proprietary algorithm. The present study shows it is possible to train a decision tree algorithm using environmental sensor data collected in the field.


2020 ◽  
Vol 16 (1) ◽  
pp. 59-64
Author(s):  
Jaja Miharja ◽  
Jordy Lasmana Putra ◽  
Nur Hadianto

Analysis of hotel review sentiment is very helpful to be used as a benchmark or reference for making hotel business decisions today. However, all the review information obtained must be processed first by using an algorithm. The purpose of this study is to compare the Classification Algorithm of Machine Learning to obtain information that has a better level of accuracy in the analysis of hotel reviews. The algorithm that will be used is k-NN (k-Nearest Neighbor) and NB (Naive Bayes). After doing the calculation, the following accuracy level is obtained: k-NN of 60,50% with an AUC value of 0.632 and NB of 85,25% with an AUC value of 0.658. These results can be determined by the right algorithm to assist in making accurate decisions by business people in the analysis of hotel reviews using the NB Algorithm.


The main objective of this paper is Analyze the reviews of Social Media Big Data of E-Commerce product’s. And provides helpful result to online shopping customers about the product quality and also provides helpful decision making idea to the business about the customer’s mostly liking and buying products. This covers all features or opinion words, like capitalized words, sequence of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunction words and negation words etc available in tweets. The existing work has considered only two or three features to perform Sentiment Analysis with the machine learning technique Natural Language Processing (NLP). In this proposed work familiar Machine Learning classification models namely Multinomial Naïve Bayes, Support Vector Machine, Decision Tree Classifier, and, Random Forest Classifier are used for sentiment classification. The sentiment classification is used as a decision support system for the customers and also for the business.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-21 ◽  
Author(s):  
Yuanjun Guo ◽  
Zhile Yang ◽  
Shengzhong Feng ◽  
Jinxing Hu

Efficient and valuable strategies provided by large amount of available data are urgently needed for a sustainable electricity system that includes smart grid technologies and very complex power system situations. Big Data technologies including Big Data management and utilization based on increasingly collected data from every component of the power grid are crucial for the successful deployment and monitoring of the system. This paper reviews the key technologies of Big Data management and intelligent machine learning methods for complex power systems. Based on a comprehensive study of power system and Big Data, several challenges are summarized to unlock the potential of Big Data technology in the application of smart grid. This paper proposed a modified and optimized structure of the Big Data processing platform according to the power data sources and different structures. Numerous open-sourced Big Data analytical tools and software are integrated as modules of the analytic engine, and self-developed advanced algorithms are also designed. The proposed framework comprises a data interface, a Big Data management, analytic engine as well as the applications, and display module. To fully investigate the proposed structure, three major applications are introduced: development of power grid topology and parallel computing using CIM files, high-efficiency load-shedding calculation, and power system transmission line tripping analysis using 3D visualization. The real-system cases demonstrate the effectiveness and great potential of the Big Data platform; therefore, data resources can achieve their full potential value for strategies and decision-making for smart grid. The proposed platform can provide a technical solution to the multidisciplinary cooperation of Big Data technology and smart grid monitoring.


Sign in / Sign up

Export Citation Format

Share Document