Comparison of data mining algorithms for pressure prediction of crude oil pipeline to identify congeal

2021 ◽  
Vol 325 ◽  
pp. 02002
Author(s):  
Agus Santoso ◽  
F. Danang Wijaya ◽  
Noor Akhmad Setiawan ◽  
Joko Waluyo

Data mining is applied in many areas. In the oil and gas industry, data mining may be used to support operational decision making and to prevent massive losses. One of the serious problems in the petroleum industry is the congeal phenomenon, since it blocks crude oil flow during transport in a pipeline system. In crude oil pipeline systems, online pressure monitoring is usually implemented to control the congeal phenomenon. However, such monitoring cannot predict the pipeline pressure over the next several days. This research compares pressure predictions for a crude oil pipeline made by data mining algorithms based on real historical data from a petroleum field. To find the best algorithm, four data mining algorithms were compared: Random Forest, Multilayer Perceptron (MLP), Decision Tree, and Linear Regression. Linear Regression showed the best performance among the four algorithms, with R² = 0.55 and RMSE = 28.34. This research confirmed that data mining is a good method for predicting crude oil pipeline pressure in the petroleum industry, even though the accuracy of the predicted values should be improved. To achieve better accuracy, it is necessary to collect more data and to find a better-performing data mining algorithm.
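The two metrics reported in this abstract, R² and RMSE, can be computed as in the minimal sketch below. The function names and the pressure values are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical pressure readings (illustrative only, not from the paper).
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 145.0, 150.0]

print("RMSE:", rmse(observed, predicted))
print("R^2:", r_squared(observed, predicted))  # 0.875 for these values
```

RMSE is expressed in the units of the predicted quantity, while R² is unitless, which is why comparisons of regression models often report both.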

Author(s):  
Zhi-Hua Zhou

Data mining attempts to identify valid, novel, potentially useful, and ultimately understandable patterns from huge volumes of data. The mined patterns must be ultimately understandable because the purpose of data mining is to aid decision-making. If the decision-makers cannot understand what a mined pattern means, then the pattern cannot be used well. Since most decision-makers are not data mining experts, the patterns should ideally be in a style comprehensible to ordinary people. So the comprehensibility of data mining algorithms, that is, the ability of a data mining algorithm to produce patterns understandable to human beings, is an important factor.


Author(s):  
TZUNG-PEI HONG ◽  
CHAN-SHENG KUO ◽  
SHENG-CHAI CHI

Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify the relationships among transactions using binary values. Transactions with quantitative values are, however, commonly seen in real-world applications. We proposed a fuzzy mining algorithm in which each attribute used only the linguistic term with the maximum cardinality in the mining process. The number of items was thus the same as the number of original attributes, which reduced the processing time. The fuzzy association rules derived in this way are not complete. This paper thus modifies that algorithm and proposes a new fuzzy data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm can derive a more complete set of rules, but requires more computation time than the earlier method. A trade-off thus exists between computation time and the completeness of the rules. Choosing an appropriate learning method thus depends on the requirements of the application domain.
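The step of keeping only the linguistic term with the maximum cardinality can be illustrated with a small sketch. The triangular membership functions, the term names (Low, Middle, High), and the quantities below are assumptions made for illustration; the abstract does not specify them.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at x == b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for one attribute, as (a, b, c) triples.
TERMS = {
    "Low": (0, 1, 6),
    "Middle": (1, 6, 11),
    "High": (6, 11, 16),
}


def term_cardinalities(quantities):
    """Sum each term's membership values over all transactions."""
    return {
        name: sum(triangular(q, *params) for q in quantities)
        for name, params in TERMS.items()
    }


def max_cardinality_term(quantities):
    """Return the term with the largest fuzzy cardinality for the attribute."""
    cardinalities = term_cardinalities(quantities)
    return max(cardinalities, key=cardinalities.get)


# Hypothetical purchased quantities of one item across four transactions.
quantities = [2, 5, 7, 6]
print(term_cardinalities(quantities))
print(max_cardinality_term(quantities))  # "Middle" for these values
```

Keeping one term per attribute keeps the item count equal to the attribute count, which matches the reduced processing time the abstract describes; retaining all terms would be the route to a more complete rule set at higher cost.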


2018 ◽  
Vol 48 (4) ◽  
pp. 281-285
Author(s):  
Y. J. HAO

This paper studies and analyzes a data mining algorithm based on cloud computing. Firstly, the research status and background of data mining algorithms based on cloud computing are introduced briefly. Secondly, the design of the hash algorithm under a cellular neural network, which is needed in this paper, is introduced. Next, the design of a wavelet data compression algorithm for wireless sensor networks is described. Finally, the experimental results and the optimization similarity analysis are presented. The analysis results show that the data mining algorithm based on cloud computing constructed in this paper plays an important role in data mining and can, to some extent, improve cloud-based data mining algorithms and the development level of cloud computing and big data technology.


Author(s):  
Moloud Abdar ◽  
Sharareh R. Niakan Kalhori ◽  
Tole Sutikno ◽  
Imam Much Ibnu Subroto ◽  
Goli Arji

Heart diseases are among the nation’s leading causes of mortality and morbidity. Data mining techniques can predict the likelihood of patients getting a heart disease. The purpose of this study is to compare different data mining algorithms for the prediction of heart diseases. This work applied and compared data mining techniques to predict the risk of heart diseases. After feature analysis, models based on five algorithms, namely decision tree (C5.0), neural network, support vector machine (SVM), logistic regression, and k-nearest neighbors (KNN), were developed and validated. The C5.0 decision tree built the model with the greatest accuracy, 93.02%, while KNN, SVM, and the neural network reached 88.37%, 86.05%, and 80.23%, respectively. The results produced by the decision tree are easy to interpret and apply; its rules can be readily understood by different clinical practitioners.


Author(s):  
Seyed Mohammad Ayyoubzadeh ◽  
Seyed Mehdi Ayyoubzadeh ◽  
Hoda Zahedi ◽  
Mahnaz Ahmadi ◽  
Sharareh R Niakan Kalhori

BACKGROUND The recent global outbreak of coronavirus disease (COVID-19) is affecting many countries worldwide. Iran is one of the top 10 most affected countries. Search engines provide useful data from populations, and these data might be useful to analyze epidemics. Utilizing data mining methods on electronic resources’ data might provide a better insight into the COVID-19 outbreak to manage the health crisis in each country and worldwide. OBJECTIVE This study aimed to predict the incidence of COVID-19 in Iran. METHODS Data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, and root mean square error (RMSE) was used as the performance metric. RESULTS The linear regression model predicted the incidence with an RMSE of 7.562 (SD 6.492). The most effective factors besides previous day incidence included the search frequency of handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187 (SD 20.705). CONCLUSIONS Data mining algorithms can be employed to predict trends of outbreaks. This prediction might support policymakers and health care managers to plan and allocate health care resources accordingly.
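The evaluation protocol described in METHODS (10-fold cross-validation with RMSE, reported as a mean with a standard deviation) can be sketched as follows. This is a simplified illustration under stated assumptions: a single-feature least-squares line stands in for the study's linear regression model, the folds are interleaved rather than shuffled, and the data are synthetic rather than Google Trends data.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=10):
    """Return (mean, standard deviation) of the per-fold RMSE values."""
    n = len(xs)
    scores = []
    for fold in range(k):
        # Interleaved split: every k-th sample is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        sq_errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(sq_errors) / len(sq_errors)))
    return statistics.mean(scores), statistics.stdev(scores)


# Synthetic data: a linear trend with a small alternating disturbance.
xs = [float(i) for i in range(30)]
ys = [2.0 * x + 1.0 + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]

mean_rmse, sd_rmse = k_fold_rmse(xs, ys, k=10)
print(f"RMSE {mean_rmse:.3f} (SD {sd_rmse:.3f})")
```

Reporting the standard deviation alongside the mean shows how much the error varies from fold to fold, which is the form the RESULTS section uses.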


2018 ◽  
Vol 7 (3.4) ◽  
pp. 13
Author(s):  
Gourav Bathla ◽  
Himanshu Aggarwal ◽  
Rinkle Rani

Data mining is one of the most researched fields in computer science. Several studies have been carried out to extract and analyse important information from raw data. Traditional data mining algorithms such as classification, clustering and statistical analysis can process small-scale data with great efficiency and accuracy. Social networking interactions, business transactions and other communications result in Big data, which is data at a scale beyond the capability of traditional data mining techniques. It is observed that traditional data mining algorithms are not capable of storing and processing large-scale data. Even where some algorithms are capable, their response time is very high. Big data contains hidden information which, if analysed in an intelligent manner, can be highly beneficial for business organizations. In this paper, we have analysed the advancement from traditional data mining algorithms to Big data mining algorithms. Applications of traditional data mining algorithms can be incorporated straightforwardly into Big data mining algorithms. Several studies have compared traditional data mining with Big data mining, but very few have analysed the most important algorithms within one research work, which is the core motive of our paper. Readers can easily observe the differences between these algorithms, along with their pros and cons. Mathematical concepts are applied in data mining algorithms: means and Euclidean distance calculation in Kmeans, vectors and margins in SVM, and Bayes' theorem and conditional probability in the Naïve Bayes algorithm are real examples. Classification and clustering are the most important applications of data mining. In this paper, the Kmeans, SVM and Naïve Bayes algorithms are analysed in detail to observe their accuracy and response time from both conceptual and empirical perspectives. Big data technologies such as Hadoop and MapReduce are used for implementing Big data mining algorithms. Performance evaluation metrics such as speedup, scaleup and response time are used to compare traditional mining with Big data mining.
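The two mathematical ingredients named for Kmeans, Euclidean distance and means, can be seen in one assignment-and-update iteration. The sketch below is a plain single-machine illustration with hypothetical points and starting centroids; it is not the paper's Hadoop/MapReduce implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the coordinate-wise mean of its members."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # An empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(coord) / len(members) for coord in zip(*members))
        )
    return new_centroids


# Hypothetical 2-D points and starting centroids (illustrative only).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

labels = assign(points, centroids)
centroids = update(points, labels, centroids)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(0.0, 1.0), (10.0, 11.0)]
```

In practice the two steps repeat until the labels stop changing. Both steps split naturally across data partitions, since distances are computed per point and means can be combined from partial sums and counts.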


2015 ◽  
Vol 742 ◽  
pp. 395-398
Author(s):  
Chun Ping Wang

A feature of large-scale text data mining methods is that they avoid lexical and syntactic semantic analysis and instead process large text data by statistical means, thereby largely ignoring semantic differences between literally similar expressions and adapting to the characteristics of network language. The results of our paper show that data mining algorithms can extract information that portrays the vocabulary characteristics of specific users and can make recommendations based on those characteristics.


2014 ◽  
Vol 644-650 ◽  
pp. 1702-1705
Author(s):  
Jin Hai Zhang

Because internet data are massive, diverse, heterogeneous, and dynamic, the storage and processing efficiency of traditional databases can no longer meet the requirements for analyzing them. To address the shortcomings of traditional data mining on massive data, leading-edge distributed computing technology is used to implement an improved data mining algorithm on the Hadoop distributed computing platform, which can serve as a reference for later implementations of other data mining algorithms on Hadoop. With a rich set of data mining algorithms, more value can be found in the data.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Jiangang Sun ◽  
Xiaoran Jiang ◽  
Guoliang Yuan ◽  
Zhenhuai Chen

With the continuous improvement of living standards, the level of physical development of adolescents has improved significantly. However, their physical function and health have developed relatively slowly and even appear to be declining. To address this problem and enhance young people’s physical fitness and mental health, this paper proposes a novel data mining algorithm based on big data for monitoring adolescent students’ physical health. Since big data technology has positive practical significance in promoting young people’s healthy development and individual health rights, this article implements commonly used data mining algorithms on Hadoop/Spark big data platforms. A comparison of the algorithms’ running times on different platforms verified that the big data platform offers good computing performance for data mining algorithms. The resulting work is intended to serve as a complete physical health data management system that effectively stores, processes, and analyzes adolescents’ physical test data.

