Effect of river flow on the quality of estuarine and coastal waters using machine learning models

2018 ◽  
Vol 12 (1) ◽  
pp. 810-823 ◽  
Author(s):  
Mohamad Javad Alizadeh ◽  
Mohamad Reza Kavianpour ◽  
Malihe Danesh ◽  
Jason Adolf ◽  
Shahabbodin Shamshirband ◽  
...  
Author(s):  
Noé Sturm ◽  
Jiangming Sun ◽  
Yves Vandriessche ◽  
Andreas Mayr ◽  
Günter Klambauer ◽  
...  

This article describes an application of high-throughput fingerprints (HTSFP) built upon industrial data accumulated over the years. The fingerprint was used to build machine learning models (multi-task deep learning + SVM) for compound activity predictions towards a panel of 131 targets. Quality of the predictions and the scaffold hopping potential of the HTSFP were systematically compared to traditional structural descriptors ECFP.


Author(s):  
Jože M. Rožanec ◽  
Elena Trajkova ◽  
Jinzhi Lu ◽  
Nikolaos Sarantinoudis ◽  
Georgios Arampatzis ◽  
...  

Refineries execute a series of interlinked processes, where the product of one unit serves as the input to another process. Potential failures within these processes affect the quality of the end products, operational efficiency, and revenue of the entire refinery. In this context, implementation of a real-time cognitive module, referring to predictive machine learning models, enables the provision of equipment state monitoring services and the generation of decision-making for equipment operations. In this paper, we propose two machine learning models: 1) to forecast the amount of pentane (C5) content in the final product mixture; 2) to identify if C5 content exceeds the specification thresholds for the final product quality. We validate our approach using a use case from a real-world refinery. In addition, we develop a visualization to assess which features are considered most important during feature selection, and later by the machine learning models. Finally, we provide insights on the sensor values in the dataset, which help to identify the operational conditions for using such machine learning models.


2021 ◽  
Vol 11 (24) ◽  
pp. 11790
Author(s):  
Jože Martin Rožanec ◽  
Elena Trajkova ◽  
Jinzhi Lu ◽  
Nikolaos Sarantinoudis ◽  
George Arampatzis ◽  
...  

Refineries execute a series of interlinked processes, where the product of one unit serves as the input to another process. Potential failures within these processes affect the quality of the end products, operational efficiency, and revenue of the entire refinery. In this context, implementation of a real-time cognitive module, referring to predictive machine learning models, enables the provision of equipment state monitoring services and the generation of decision-making for equipment operations. In this paper, we propose two machine learning models: (1) to forecast the amount of pentane (C5) content in the final product mixture; (2) to identify if C5 content exceeds the specification thresholds for the final product quality. We validate our approach using a use case from a real-world refinery. In addition, we develop a visualization to assess which features are considered most important during feature selection, and later by the machine learning models. Finally, we provide insights on the sensor values in the dataset, which help to identify the operational conditions for using such machine learning models.


2021 ◽  
Author(s):  
Michael Tarasiou

This paper presents DeepSatData, a pipeline for automatically generating satellite imagery datasets for training machine learning models. We also discuss design considerations with emphasis on dense classification tasks, e.g. semantic segmentation. The implementation presented makes use of freely available Sentinel-2 data, which allows the generation of the large-scale datasets required for training deep neural networks (DNN). We discuss issues faced from the point of view of DNN training and evaluation, such as checking the quality of ground truth data, and comment on the scalability of the approach.


2021 ◽  
Author(s):  
Andrew McDonald ◽  

Decades of subsurface exploration and characterisation have led to the collation and storage of large volumes of well-related data. The amount of data gathered daily continues to grow rapidly as technology and recording methods improve. With the increasing adoption of machine learning techniques in the subsurface domain, it is essential that the quality of the input data is carefully considered when working with these tools. If the input data is of poor quality, the impact on precision and accuracy of the prediction can be significant. Consequently, this can impact key decisions about the future of a well or a field. This study focuses on well log data, which can be highly multi-dimensional, diverse and stored in a variety of file formats. Well log data exhibits key characteristics of Big Data: Volume, Variety, Velocity, Veracity and Value. Well data can include numeric values, text values, waveform data, image arrays, maps, volumes, etc., all of which can be indexed by time or depth in a regular or irregular way. A significant portion of time can be spent gathering data and quality checking it prior to carrying out petrophysical interpretations and applying machine learning models. Well log data can be affected by numerous issues causing a degradation in data quality. These include missing data, ranging from single data points to entire curves; noisy data from tool-related issues; borehole washout; processing issues; incorrect environmental corrections; and mislabelled data. Having vast quantities of data does not mean it can all be passed into a machine learning algorithm with the expectation that the resultant prediction is fit for purpose. It is essential that the most important and relevant data is passed into the model through appropriate feature selection techniques. Not only does this improve the quality of the prediction, it also reduces computational time and can provide a better understanding of how the models reach their conclusion.
This paper reviews data quality issues typically faced by petrophysicists when working with well log data and deploying machine learning models. First, an overview of machine learning and Big Data is covered in relation to petrophysical applications. Secondly, data quality issues commonly faced with well log data are discussed. Thirdly, methods are suggested on how to deal with data issues prior to modelling. Finally, multiple case studies are discussed covering the impacts of data quality on predictive capability.
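One of the data quality issues named in this abstract, missing data ranging from single points to entire curves, can be illustrated with a small screening step. The following is only a minimal sketch and not the method used in the paper: the function names, the 20% threshold, and the sample curves (labelled with the common mnemonics GR, RHOB and DT) are all assumptions made for illustration.

```python
import math


def missing_fraction(curves):
    """Return the fraction of missing samples for each curve.

    `curves` maps a curve name to a list of readings. Both None and NaN
    count as missing. An empty curve is treated as entirely missing.
    """
    report = {}
    for name, values in curves.items():
        missing = sum(
            1
            for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        report[name] = missing / len(values) if values else 1.0
    return report


def usable_curves(curves, max_missing=0.2):
    """Keep curves whose missing fraction does not exceed `max_missing`.

    The default threshold is an illustrative assumption, not a recommendation.
    """
    report = missing_fraction(curves)
    return [name for name, frac in report.items() if frac <= max_missing]


# Hypothetical readings, invented for this example.
logs = {
    "GR": [45.0, 50.2, None, 61.3, 58.9],
    "RHOB": [2.45, float("nan"), float("nan"), None, 2.50],
    "DT": [],
}
report = missing_fraction(logs)
selected = usable_curves(logs, max_missing=0.2)
```

A screen like this only flags gaps. It says nothing about noise, washout or mislabelling, which the abstract lists as separate issues.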


2018 ◽  
Author(s):  
Milko Krachunov ◽  
Milena Sokolova ◽  
Valeriya Simeonova ◽  
Maria Nisheva ◽  
Irena Avdjieva ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Stephanie J. Eder ◽  
Andrew A. Nicholson ◽  
Michal M. Stefanczyk ◽  
Michał Pieniak ◽  
Judit Martínez-Molina ◽  
...  

The COVID-19 pandemic along with the restrictions that were introduced within Europe starting in spring 2020 allows for the identification of predictors for relationship quality during unstable and stressful times. The present study began as strict measures were enforced in response to the rising spread of the COVID-19 virus within Austria, Poland, Spain and the Czech Republic. Here, we investigated quality of romantic relationships among 313 participants as movement restrictions were implemented and subsequently phased out cross-nationally. Participants completed self-report questionnaires over a period of 7 weeks, where we predicted relationship quality and change in relationship quality using machine learning models that included a variety of potential predictors related to psychological, demographic and environmental variables. On average, our machine learning models predicted 29% (linear models) and 22% (non-linear models) of the variance with regard to relationship quality. Here, the most important predictors consisted of attachment style (anxious attachment being more influential than avoidant), age, and number of conflicts within the relationship. Interestingly, environmental factors such as the local severity of the pandemic did not exert a measurable influence with respect to predicting relationship quality. As opposed to overall relationship quality, the change in relationship quality during lockdown restrictions could not be predicted accurately by our machine learning models when utilizing our selected features. In conclusion, we demonstrate cross-culturally that attachment security is a major predictor of relationship quality during COVID-19 lockdown restrictions, whereas fear, pathogenic threat, sexual behavior, and the severity of governmental regulations did not significantly influence the accuracy of prediction.
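The share of variance predicted by a model is commonly reported as the coefficient of determination, R². The abstract does not say how its figures were computed, so the sketch below simply assumes that definition. The function name and the toy numbers are hypothetical and unrelated to the study's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Total variation of the observations around their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Toy values, invented for this example.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])   # every point matched
baseline = r_squared(observed, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
```

Under this definition, a model that always predicts the mean explains none of the variance, and a value of 0.29 would correspond to the 29% quoted for the linear models.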


T-Comm ◽  
2020 ◽  
Vol 14 (10) ◽  
pp. 53-60
Author(s):  
Oleg I. Sheluhin ◽  
Valentina P. Ivannikova ◽  

A comparative analysis of statistical and model-based methods for selecting the number and composition of informative features was performed using the UNSW-NB15 database to train machine learning models for attack detection. Feature selection is one of the most important steps in data preparation for machine learning tasks. It improves the quality of machine learning models by reducing the size of the fitted models, the training time, and the probability of overfitting. The research was conducted using Python libraries: scikit-learn, which includes various machine learning models and functions for data preparation and model evaluation, and FeatureSelector, which contains functions for statistical data analysis. Numerical results are provided from experiments applying both statistical feature selection methods and methods based on machine learning models. As a result, a reduced set of features is obtained, which improves the quality of classification by removing noise features that have little effect on the final result and reduces the number of informative features in the data set from 41 to 17. It is shown that the most effective of the analyzed feature selection methods is the statistical method SelectKBest with the chi2 function, which yields a reduced set of features providing a classification accuracy as high as 90%, compared with 74% for the full set.
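The idea behind chi-squared feature ranking can be sketched without any library. The code below is not the authors' code and does not call scikit-learn. It is a plain-Python illustration that assumes non-negative feature values and compares per-class feature sums against the sums expected if the feature were independent of the class. The function names and the toy data are hypothetical.

```python
def chi2_scores(X, y):
    """Chi-squared score per feature, for non-negative feature values."""
    n_samples = len(X)
    n_features = len(X[0])
    feature_totals = [sum(row[j] for row in X) for j in range(n_features)]
    scores = [0.0] * n_features
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        class_prob = len(rows) / n_samples
        for j in range(n_features):
            observed = sum(row[j] for row in rows)
            # Sum expected for this class if feature j ignored the label.
            expected = class_prob * feature_totals[j]
            if expected > 0:
                scores[j] += (observed - expected) ** 2 / expected
    return scores


def select_k_best(X, y, k):
    """Return the indices of the k highest-scoring features and the reduced data."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy data: features 0 and 2 track the label, feature 1 is constant noise.
X = [
    [5, 1, 0],
    [4, 1, 1],
    [0, 1, 5],
    [1, 1, 4],
]
y = [0, 0, 1, 1]
keep, X_reduced = select_k_best(X, y, k=2)
```

In this toy case the constant feature scores zero and is dropped. Reducing 41 features to 17, as the abstract reports, would correspond to choosing `k=17` in a selection step of this kind.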

