Nonlinearity Encoding to Improve Extrapolation Capabilities for Unobserved Physical States

Author(s):  
Gyoung S. Na ◽  
Seunghun Jang ◽  
Hyunju Chang

The fundamental goal of machine learning (ML) in physical science is to predict the physical properties of unobserved states. However, an accurate prediction for input data outside of training distributions is...

Nanoscale ◽  
2021 ◽  
Author(s):  
Hao Zhou ◽  
Ya-Juan Feng ◽  
Chao Wang ◽  
Teng Huang ◽  
Yi-Rong Liu ◽  
...  

Water, the most important molecule on the Earth, possesses many essential and unique physical properties that are far from completely understood, partly due to serious difficulties in identifying the precise...


2021 ◽  
Author(s):  
Yingxian Liu ◽  
Cunliang Chen ◽  
Hanqing Zhao ◽  
Yu Wang ◽  
Xiaodong Han

Abstract Fluid properties are key inputs for predicting single-well productivity, interpreting well tests, and forecasting oilfield recovery, and they directly affect the success of ODP program design. The most accurate and direct way to obtain them is underground sampling. However, not every well has samples, either for technical reasons such as excessive well deviation or because of high cost during the exploration stage. In many cases, analogies or empirical formulas therefore have to be used instead, yet a large number of oilfield developments have shown that these methods introduce very large errors. Obtaining fluid physical properties quickly and accurately is thus of great significance. In recent years, as artificial intelligence and machine learning algorithms have developed and improved, their applications in oilfields have become increasingly extensive. This paper proposes a method for predicting the physical properties of crude oil based on machine learning algorithms. The method uses PVT data from nearly 100 wells in the Bohai Oilfield: 75% of the data is used for training to obtain the prediction model, and the remaining 25% is used for testing. In practice, the predictions of the machine learning algorithm are very close to the actual data, with a very small error. Finally, the method was applied to the preliminary plan design of BZ29, a new oilfield, in particular to predict fluid physical properties for the unsampled sand bodies. The paper also compares the influence of the analogy method on the scheme, which provides a potential and risk analysis for scheme design. The method will be applied to more oilfields in the Bohai Sea in the future and has important promotion value.
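
The 75%/25% division described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed, so the shuffled random split, the function name `split_wells`, and the placeholder well identifiers below are all assumptions made for illustration, not the authors' procedure.

```python
import random


def split_wells(records, train_fraction=0.75, seed=0):
    """Shuffle the records and split them into training and test subsets.

    Hypothetical helper: the abstract only states the 75%/25% proportions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for PVT records from about 100 wells.
wells = [f"well_{i:03d}" for i in range(100)]
train, test = split_wells(wells)
print(len(train), len(test))
```

Fixing the seed is a design choice that makes the held-out set stable between runs, so that reported test errors remain comparable.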


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Mohamed Nadir Boucherit ◽  
Fahd Arbaoui

Purpose To build the input data, the authors carried out electrochemical experiments, performing voltammetric scans in a very cathodic potential region. They compiled an experimental table that records, for each experiment, the current values measured in a low polarization range and the pitting potential observed in the anodic region. The study concerns carbon steel used in a nuclear installation. The properties of the chemical solutions are close to those of the cooling fluid used in the circuit. Design/methodology/approach In a previous study, the authors demonstrated the effectiveness of machine learning in predicting the localized corrosion resistance of a material by using the physicochemical properties of its environment as input data (Boucherit et al., 2019). In the present study, the authors improve on those results by using cathodic currents as input data. The reason for this approach is to have input data that integrate both the surface state of the material and the physicochemical properties of its environment. Findings The experimental table was submitted to two neural networks, namely, a recurrent network and a convolutional network. The convolutional network gives better pitting potential predictions. The results also show that predictions based on cathodic currents are better than those based on the physicochemical properties of the solution. Originality/value The originality of the study lies in the use of cathodic currents as input data. These data contain implicit information on both the chemical environment of the material and its surface condition. This approach appears to be more efficient than using the chemical composition of the solution as input data. The objective of the study remains, at the same time, to seek the optimal neural architectures and the best input data.


2021 ◽  
Vol 6 (22) ◽  
pp. 51-59
Author(s):  
Mustazzihim Suhaidi ◽  
Rabiah Abdul Kadir ◽  
Sabrina Tiun

Extracting features from input data is vital for successful classification and machine learning tasks. Classification is the process of assigning an object to one of a set of predefined categories. Many different feature selection and feature extraction methods exist, and they are widely used. Feature extraction is a transformation of large input data into a low-dimensional feature vector, which serves as the input to a classification or machine learning algorithm. The task of feature extraction poses major challenges, which are discussed in this paper. The main challenge is to learn and extract knowledge from text datasets in order to make correct decisions. The objective of this paper is to give an overview of the methods used in feature extraction for various applications, using a dataset containing a collection of texts taken from social media.
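
As a small illustration of turning text into a fixed-length feature vector, the sketch below builds bag-of-words counts. Bag-of-words is chosen here only as a familiar example; the abstract does not name the methods it surveys, and the helper names and the two sample posts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(texts):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Return the token-count vector of one text over the vocabulary."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, count in counts.items():
        if tok in vocabulary:  # tokens unseen at vocabulary time are ignored
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical social-media style posts.
posts = ["Great phone, great battery", "Battery died"]
vocabulary = build_vocabulary(posts)
features = [vectorize(post, vocabulary) for post in posts]
print(features)
```

Each post becomes a vector whose length equals the vocabulary size, which is the form a classifier expects as input.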


Processes ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 158
Author(s):  
Ain Cheon ◽  
Jwakyung Sung ◽  
Hangbae Jun ◽  
Heewon Jang ◽  
Minji Kim ◽  
...  

The application of a machine learning (ML) model to bio-electrochemical anaerobic digestion (BEAD) is a future-oriented approach for improving process stability by predicting performances that have nonlinear relationships with various operational parameters. Five ML models, which included tree-, regression-, and neural network-based algorithms, were applied to predict the methane yield in a BEAD reactor. The results showed that various 1-step ahead ML models, which utilized prior data on BEAD performance, could enhance prediction accuracy. In addition, the 1-step ahead with retraining algorithm could improve prediction accuracy by 37.3% compared with the conventional multi-step ahead algorithm. The improvement was particularly noteworthy in tree- and regression-based ML models. Moreover, the 1-step ahead with retraining algorithm showed high potential for achieving efficient prediction using pH as the single input variable, which is plausibly an easier parameter to monitor than the other parameters required in bioprocess models.
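
The difference between 1-step ahead prediction with retraining and conventional multi-step ahead prediction can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: a simple least-squares lag-1 model stands in for the five ML models, and the function names and the toy series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        return mean_y, 0.0  # degenerate case: fall back to the mean
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / var_x
    return mean_y - b * mean_x, b


def one_step_ahead_with_retraining(series, n_train):
    """Predict each later point from the latest observation, refitting each step."""
    predictions = []
    for t in range(n_train, len(series)):
        a, b = fit_ar1(series[:t])  # retrain on everything observed so far
        predictions.append(a + b * series[t - 1])
    return predictions


def multi_step_ahead(series, n_train):
    """Fit once on the training window, then feed predictions back as inputs."""
    a, b = fit_ar1(series[:n_train])
    predictions = []
    last = series[n_train - 1]
    for _ in range(n_train, len(series)):
        last = a + b * last
        predictions.append(last)
    return predictions


# Toy series that follows y[t] = 1 + y[t-1] exactly, so both schemes recover it.
series = [float(i) for i in range(1, 11)]
print(one_step_ahead_with_retraining(series, 5))
print(multi_step_ahead(series, 5))
```

On real data the two schemes diverge, because the multi-step variant accumulates its own prediction errors while the retraining variant is corrected by each new observation.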


Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it becomes obsolete over time because new input data is generated every second. To keep our predictions accurate, we need a way to keep our models up to date. Our research work involves finding a mechanism that can retrain the model with new data automatically. This research also explores the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of experiments conducted using AutoML methods such as TPOT. This helped us find an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the inner workings of an ML model but only require the outputs of ML processes.


2021 ◽  
Author(s):  
S. H. Al Gharbi ◽  
A. A. Al-Majed ◽  
A. Abdulraheem ◽  
S. Patil ◽  
S. M. Elkatatny

Abstract Due to high demand for energy, oil and gas companies started to drill wells in remote areas and unconventional environments. This raised the complexity of drilling operations, which were already challenging and complex. To adapt, drilling companies expanded their use of the real-time operation center (RTOC) concept, in which real-time drilling data are transmitted from remote sites to companies’ headquarters. In RTOC, groups of subject matter experts monitor the drilling live and provide real-time advice to improve operations. With the increase of drilling operations, processing the volume of generated data is beyond a human's capability, limiting the RTOC impact on certain components of drilling operations. To overcome this limitation, artificial intelligence and machine learning (AI/ML) technologies were introduced to monitor and analyze the real-time drilling data, discover hidden patterns, and provide fast decision-support responses. AI/ML technologies are data-driven technologies, and their quality relies on the quality of the input data: if the quality of the input data is good, the generated output will be good; if not, the generated output will be bad. Unfortunately, due to the harsh environments of drilling sites and the transmission setups, not all of the drilling data is good, which negatively affects the AI/ML results. The objective of this paper is to utilize AI/ML technologies to improve the quality of real-time drilling data. The paper fed a large real-time drilling dataset, consisting of over 150,000 raw data points, into Artificial Neural Network (ANN), Support Vector Machine (SVM) and Decision Tree (DT) models. The models were trained on the valid and not-valid datapoints. The confusion matrix was used to evaluate the different AI/ML models including different internal architectures. Despite the slowness of ANN, it achieved the best result with an accuracy of 78%, compared to 73% and 41% for DT and SVM, respectively. 
The paper concludes by presenting a process for using AI technology to improve real-time drilling data quality. To the authors' knowledge, based on literature in the public domain, this paper is one of the first to compare the use of multiple AI/ML techniques for quality improvement of real-time drilling data. The paper provides a guide for improving the quality of real-time drilling data.
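
For readers unfamiliar with the evaluation step, the sketch below shows how an accuracy figure is derived from a confusion matrix for a valid/not-valid labelling task. The labels and the five sample data points are hypothetical and are not taken from the paper's dataset.

```python
def confusion_matrix(actual, predicted, labels):
    """Count outcomes: rows are actual labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels for five drilling data points.
actual = ["valid", "valid", "not-valid", "not-valid", "valid"]
predicted = ["valid", "not-valid", "not-valid", "not-valid", "valid"]
matrix = confusion_matrix(actual, predicted, ["valid", "not-valid"])
print(matrix, accuracy(matrix))
```

Accuracy alone can hide an imbalance between the two classes, which is one reason to inspect the full matrix rather than the single number.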


Biotechnology ◽  
2019 ◽  
pp. 562-575
Author(s):  
Suraj Sawant

Deep learning (DL) is a machine learning method that runs on artificial neural networks whose structure goes beyond standard architectures in order to deal with large amounts of data. Its rise is generally attributed to the increasing amount of data, larger input data sizes, and the greater complexity of real-world problems. Research studies in the associated literature show that DL currently performs well on the problems considered and appears to be a strong solution for more advanced problems of the future. In this context, this chapter aims to provide some essential information about DL and its applications within the field of biomedical engineering. The chapter is organized as a reference source that gives readers an idea of the relation between DL and biomedical engineering.


2012 ◽  
pp. 1779-1798
Author(s):  
Dumitru Dan Burdescu ◽  
Marian Cristian Mihaescu

Self-assessment is one of the crucial activities within e-learning environments that provide learners with feedback regarding their level of accumulated knowledge. From this point of view, the authors think that guiding learners in self-assessment must be an important goal for developers of e-learning environments. The aim of the chapter is to present a recommender software system that runs alongside the e-learning platform and improves the effectiveness of self-assessment activities. The activities performed by learners represent the input data, and machine learning algorithms are used within the business logic of the recommender software system. The output of the recommender software system is advice given to learners in order to improve the effectiveness of the self-assessment process. The methodology for improving self-assessment is based on embedding knowledge management into the business logic of the e-learning platform. A Naive Bayes classifier is used as the machine learning algorithm for obtaining the resources (e.g., questions, chapters, and concepts) that need to be further accessed by learners. The analysis is carried out for disciplines that are well structured according to a concept map. The input data set for the recommender software system consists of student activities that are monitored within the Tesys e-learning platform. This platform was designed and implemented within the Multimedia Applications Development Research Center at the Software Engineering Department, University of Craiova. Student activities are monitored through various techniques, such as creating log files or adding records to a database table. The logging facilities are embedded in the business logic of the e-learning platform. The e-learning platform is based on a software development framework that uses only open source software.
The software architecture of the e-learning platform is based on the MVC (model-view-controller) model, which ensures independence between the model (represented by a MySQL database), the controller (represented by the business logic of the platform, implemented in Java) and the view (represented by WebMacro, a 100% Java open-source template language).

