Implementation of the solution to the oil displacement problem using machine learning classifiers and neural networks

2021 ◽  
Vol 5 (4 (113)) ◽  
pp. 55-63
Author(s):  
Beimbet Daribayev ◽  
Aksultan Mukhanbet ◽  
Yedil Nurakhov ◽  
Timur Imankulov

The problem of oil displacement was solved using neural networks and machine learning classifiers. The Buckley-Leverett model, which describes the process of oil displacement by water, is selected; it consists of the continuity equations for the oil and water phases and Darcy's law. The challenge is to optimize the solution of the oil displacement problem. Optimization is performed at three levels: vectorization of calculations; implementation of classical algorithms; implementation of the algorithm using neural networks. A feature of the proposed approach is that it identifies, by comparing the results of machine learning classifiers and different types of neural networks, the method with the highest accuracy and smallest errors. This is also one of the first papers to compare machine learning classifiers with feedforward and recurrent neural networks. Classification was carried out with three algorithms: decision tree, support vector machine (SVM) and gradient boosting. As a result of the study, the gradient boosting classifier and the neural network showed high accuracy, 99.99 % and 97.4 % respectively. The recurrent neural network trained faster than the others, while the SVM classifier had the lowest accuracy score. To achieve this goal, a dataset was created containing over 67,000 samples across 10 classes. These data are important for problems of oil displacement in porous media. The proposed methodology provides a simple and elegant way to incorporate domain knowledge of oil displacement into machine learning algorithms. This addresses two of the most significant drawbacks of machine learning algorithms: the need for large datasets and limited robustness under extrapolation. The presented principles can be generalized in countless ways and should lead to a new class of algorithms for solving both forward and inverse oil problems.
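The comparison the abstract describes (decision tree vs. SVM vs. gradient boosting) can be sketched with scikit-learn. This is a hypothetical illustration only: the synthetic dataset below stands in for the authors' ~67,000-sample saturation data, and the feature/class counts are assumptions, not values from the study.

```python
# Sketch: comparing the three classifier families the study evaluates
# on a synthetic stand-in dataset (NOT the authors' oil-displacement data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Synthetic placeholder features; the real study used ~67,000 samples, 10 classes.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": SVC(kernel="rbf"),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
print(scores)
```

On real saturation fields the ranking may differ; the study reports gradient boosting as the most accurate of the three.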

Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and recognition of emotions in speech (RER) is the most demanded part of it. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computer studies are carried out on the RAVDESS database, which contains emotional human speech. RAVDESS is a dataset containing 7356 files. Recordings contain the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed so as to extract the main characteristic features of particular emotions. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the audio recordings. Various neural network models for emotion recognition are studied on this data, and machine learning algorithms are used for comparative analysis. The following models were trained during the experiments: logistic regression (LR), a support vector machine classifier (SVM), decision tree (DT), random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN), a recurrent neural network (RNN, with ResNet18), and an ensemble of convolutional and recurrent networks (Stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.
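The pre-processing step above (turning raw audio into per-frame spectral features) can be illustrated with a much-simplified sketch. This is not the paper's pipeline: the study uses MFCC and chroma coefficients (typically extracted with a dedicated audio library), while the toy version below extracts only log-energy and spectral centroid with plain NumPy, and the input is a synthetic tone rather than a RAVDESS recording.

```python
# Simplified, hypothetical stand-in for audio feature extraction:
# frame the signal, window it, and compute two spectral descriptors.
import numpy as np

def frame_features(signal, sr=16000, frame_len=512, hop=256):
    """Return per-frame (log-energy, spectral centroid) pairs."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        spectrum = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
        energy = np.log(np.sum(spectrum ** 2) + 1e-10)
        centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-10)
        feats.append((energy, centroid))
    return np.array(feats)

# A 0.5 s synthetic 440 Hz tone as a placeholder for a speech recording.
t = np.linspace(0, 0.5, 8000, endpoint=False)
features = frame_features(np.sin(2 * np.pi * 440 * t))
print(features.shape)  # one (energy, centroid) row per frame
```

The resulting feature matrix is what would be fed to the classifiers; real MFCC/chroma extraction yields richer, perceptually motivated features.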


Water ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 2927
Author(s):  
Jiyeong Hong ◽  
Seoro Lee ◽  
Joo Hyun Bae ◽  
Jimin Lee ◽  
Woon Ji Park ◽  
...  

Predicting dam inflow is necessary for effective water management. This study created machine learning algorithms to predict the amount of inflow into the Soyang River Dam in South Korea, using 40 years of weather and dam inflow data. A total of six algorithms were used: decision tree (DT), multilayer perceptron (MLP), random forest (RF), gradient boosting (GB), recurrent neural network–long short-term memory (RNN–LSTM), and convolutional neural network–LSTM (CNN–LSTM). Among these models, the multilayer perceptron showed the best results in predicting dam inflow, with a Nash–Sutcliffe efficiency (NSE) of 0.812, root mean squared error (RMSE) of 77.218 m3/s, mean absolute error (MAE) of 29.034 m3/s, correlation coefficient (R) of 0.924, and determination coefficient (R2) of 0.817. However, when the dam inflow is below 100 m3/s, the ensemble models (random forest and gradient boosting) performed better than the MLP. Therefore, two combined machine learning (CombML) models (RF_MLP and GB_MLP) were developed that use the ensemble methods (RF and GB) at precipitation below 16 mm and the MLP at precipitation above 16 mm; 16 mm is the average daily precipitation at inflows of 100 m3/s or more. Verification yielded NSE 0.857, RMSE 68.417 m3/s, MAE 18.063 m3/s, R 0.927, and R2 0.859 for RF_MLP, and NSE 0.829, RMSE 73.918 m3/s, MAE 18.093 m3/s, R 0.912, and R2 0.831 for GB_MLP, which indicates that the combined models predict dam inflow the most accurately. The CombML results show that, by combining several machine learning algorithms, it is possible to predict inflow while accounting for flow characteristics such as flow regimes.
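The CombML idea (route each day to a different model depending on precipitation) reduces to a simple dispatcher. The sketch below is an assumed minimal version: the data is synthetic, the inflow response is a toy linear-plus-noise function, and only the 16 mm routing threshold comes from the study.

```python
# Hedged sketch of CombML-style routing: an ensemble regressor for
# low-precipitation days, an MLP for high-precipitation days.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
precip = rng.gamma(2.0, 8.0, size=1000)            # daily precipitation, mm (toy)
inflow = 5.0 * precip + rng.normal(0, 5, 1000)     # toy inflow response, m3/s
X = precip.reshape(-1, 1)

THRESHOLD_MM = 16.0                                # routing threshold from the study
low, high = precip < THRESHOLD_MM, precip >= THRESHOLD_MM

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X[low], inflow[low])
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                   random_state=0).fit(X[high], inflow[high])

def predict_inflow(p_mm):
    """Dispatch to the sub-model responsible for this precipitation regime."""
    model = rf if p_mm < THRESHOLD_MM else mlp
    return float(model.predict(np.array([[p_mm]]))[0])
```

The real models additionally use multi-variable weather inputs; the point here is only the regime-based dispatch.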


2020 ◽  
Author(s):  
Kadi L. Saar ◽  
Alexey S. Morgunov ◽  
Runzhang Qi ◽  
William E. Arter ◽  
Georg Krainer ◽  
...  

Abstract Intracellular phase separation of proteins into biomolecular condensates is increasingly recognised as an important phenomenon for cellular compartmentalisation and regulation of biological function. Different hypotheses about the parameters that determine the tendency of proteins to form condensates have been proposed, with some of them probed experimentally through the use of constructs generated by sequence alterations. To broaden the scope of these observations, here, we established an in silico strategy for understanding on a global level the associations between protein sequence and condensate formation, and used this information to construct machine learning classifiers for predicting liquid–liquid phase separation (LLPS) from protein sequence. Our analysis highlighted that LLPS-prone sequences are more disordered, hydrophobic and of lower Shannon entropy than sequences in the Protein Data Bank or the Swiss-Prot database, and have their disordered regions enriched in polar, aromatic and charged residues. Using these determining features together with neural network based word2vec sequence embeddings, we developed machine learning classifiers for predicting protein condensate formation. Our model, trained to distinguish LLPS-prone sequences from structured proteins, achieved high accuracy (93%; 25-fold cross-validation) and identified condensate forming sequences from external independent test data at 97% sensitivity. Moreover, in combination with a classifier that had developed a nuanced insight into the features governing protein phase behaviour by learning to distinguish between sequences of varying LLPS propensity, the sensitivity was supplemented with high specificity (approximated ROC–AUC of 0.85). These results provide a platform rooted in molecular principles for understanding protein phase behaviour. The predictor is accessible from https://deephase.ch.cam.ac.uk/.

Significance Statement: The tendency of many cellular proteins to form protein-rich biomolecular condensates underlies the formation of subcellular compartments and has been linked to various physiological functions. Understanding the molecular basis of this fundamental process and predicting protein phase behaviour have therefore become important objectives. To develop a global understanding of how protein sequence determines its phase behaviour, here, we constructed bespoke datasets of proteins of varying phase separation propensity and identified explicit biophysical and sequence-specific features common to phase separating proteins. Moreover, by combining this insight with neural network based sequence embeddings, we trained machine learning classifiers that identified phase separating sequences with high accuracy, including from independent external test data. The predictor is available from https://deephase.ch.cam.ac.uk/.
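Two of the sequence features the analysis highlights, Shannon entropy and residue-class composition, are easy to compute directly. The sketch below is illustrative, not the authors' pipeline; the particular polar/charged residue grouping is an assumption for the example.

```python
# Illustrative sketch: two sequence-level features discussed in the abstract.
from collections import Counter
import math

# Assumed grouping of polar + charged amino acids (one-letter codes).
POLAR_CHARGED = set("STNQYCDEKRH")

def shannon_entropy(seq):
    """Shannon entropy (bits per residue) of an amino-acid sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def polar_charged_fraction(seq):
    """Fraction of residues that are polar or charged."""
    return sum(aa in POLAR_CHARGED for aa in seq) / len(seq)

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary example sequence
print(round(shannon_entropy(seq), 3), round(polar_charged_fraction(seq), 3))
```

Low-complexity (LLPS-prone) sequences have lower entropy than the ~4.2-bit maximum for a uniform 20-letter alphabet, which is the intuition behind using entropy as a feature.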


2021 ◽  
Vol 71 (4) ◽  
pp. 302-317
Author(s):  
Jelena Đuriš ◽  
Ivana Kurćubić ◽  
Svetlana Ibrić

Machine learning algorithms, and artificial intelligence in general, have a wide range of applications in the field of pharmaceutical technology. Starting from formulation development, and with great potential for integration within the Quality by Design framework, these data science tools provide a better understanding of pharmaceutical formulations and their processing. Machine learning algorithms can be especially helpful in the analysis of the large volume of data generated by process analytical technologies. This paper provides a brief explanation of artificial neural networks, as one of the most frequently used machine learning algorithms. The process of network training and testing is described and accompanied by illustrative examples of machine learning tools applied in the context of pharmaceutical formulation development and related technologies, as well as an overview of future trends. Recently published studies on more sophisticated methods, such as deep neural networks and the light gradient boosting machine algorithm, are also described. The interested reader is referred to several official documents (guidelines) that pave the way for a more structured representation of machine learning models in prospective submissions to regulatory bodies.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mikael Jamil ◽  
Ashwin Phatak ◽  
Saumya Mehta ◽  
Marco Beato ◽  
Daniel Memmert ◽  
...  

Abstract This study applied multiple machine learning algorithms to classify the performance levels of professional goalkeepers (GKs). Technical performances of GKs competing in the elite divisions of England, Spain, Germany, and France were analysed in order to determine which factors distinguish elite GKs from sub-elite GKs. A total of n = 14,671 player-match observations were analysed via multiple machine learning algorithms (MLAs): logistic regression (LR), gradient boosting classifiers (GBC) and random forest classifiers (RFC). The results revealed that 15 features common across the three MLAs, pertaining to the actions of passing and distribution, distinguished goalkeepers performing at the elite level from those that do not. Specifically, short distribution, passing the ball successfully, receiving passes successfully, and keeping clean sheets were all revealed to be common traits of GKs performing at the elite level. Moderate to high accuracy was reported across all the MLAs for the training data, LR (0.7), RFC (0.82) and GBC (0.71), and the testing data, LR (0.67), RFC (0.66) and GBC (0.66). Ultimately, the results suggest that a GK's ability with their feet, and not necessarily their hands, is what distinguishes elite GKs from the sub-elite.


Author(s):  
Hanein Omar Mohamed, Basma F. Idris

Asthma is a chronic disease caused by inflammation of the airways. Diagnosis, prediction and classification of asthma have been major and attractive areas of research for decades, using a variety of recent techniques; however, the main problem with asthma is misdiagnosis. This paper summarizes and compares different artificial neural network techniques used to solve this problem, applying different algorithms (data mining algorithms, machine learning algorithms and deep machine learning algorithms) to achieve high accuracy in the diagnosis, prediction and classification of asthma, passing through three stages: data acquisition, feature extraction and data classification. According to the comparison of different techniques, the highest accuracy achieved by an ANN was 98.85 % and the lowest was 80 %, while the accuracy achieved by a support vector machine (SVM) was 86 % when Mel-frequency cepstral coefficients (MFCC) were used for feature extraction and 99.34 % when the Relief algorithm was used. Based on this comparison, we recommend that researchers applying the same techniques consult previous studies in order to achieve high accuracy.


2021 ◽  
Vol 4 (4) ◽  
pp. 309-315
Author(s):  
Kumawuese Jennifer Kurugh ◽  
Muhammad Aminu Ahmad ◽  
Awwal Ahmad Babajo

Datasets are a major requirement in the development of breast cancer classification/detection models using machine learning algorithms. These models can provide an effective, accurate and less expensive diagnosis method and reduce loss of life. However, using the same machine learning algorithms on different datasets yields different results. This research developed several machine learning models for breast cancer classification/detection using random forest, support vector machine, k-nearest neighbors, Gaussian naïve Bayes, perceptron and logistic regression. Three widely used test datasets were used: Wisconsin Breast Cancer (WBC) Original, Wisconsin Diagnostic Breast Cancer (WDBC) and Wisconsin Prognostic Breast Cancer (WPBC). The results show that datasets affect the performance of machine learning classifiers, and that different classifiers perform differently on a given breast cancer dataset.
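One of the three datasets the study uses, WDBC, ships with scikit-learn, so the classifier comparison can be sketched directly. This is a minimal illustration of the experimental setup, not the authors' exact protocol (their splits, preprocessing and metrics are not specified here).

```python
# Sketch: comparing five of the study's classifier families on the
# Wisconsin Diagnostic Breast Cancer (WDBC) dataset bundled with sklearn.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

classifiers = {
    "random_forest": RandomForestClassifier(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "naive_bayes": GaussianNB(),
    "logistic_regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
}
accuracies = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
              for name, clf in classifiers.items()}
print(accuracies)
```

Repeating the same loop over WBC and WPBC would reproduce the study's central point: the ranking of classifiers shifts with the dataset.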


2018 ◽  
Vol 2018 (3) ◽  
pp. 123-142 ◽  
Author(s):  
Ehsan Hesamifard ◽  
Hassan Takabi ◽  
Mehdi Ghasemi ◽  
Rebecca N. Wright

Abstract Machine learning algorithms based on deep neural networks (NNs) have achieved remarkable results and are being extensively used in different domains. On the other hand, with the growth of cloud services, several Machine Learning as a Service (MLaaS) offerings have emerged, in which training and deployment of machine learning models are performed on cloud providers' infrastructure. However, machine learning algorithms require access to raw data, which is often privacy sensitive and can create potential security and privacy risks. To address this issue, we present CryptoDL, a framework that develops new techniques for applying deep neural network algorithms to encrypted data. In this paper, we provide the theoretical foundation for implementing deep neural network algorithms in the encrypted domain and develop techniques to adapt neural networks to the practical limitations of current homomorphic encryption schemes. We show that it is feasible and practical to train neural networks on encrypted data, to make encrypted predictions, and to return the predictions in encrypted form. We demonstrate the applicability of the proposed CryptoDL on a large number of datasets and evaluate its performance. The empirical results show that it provides accurate privacy-preserving training and classification.
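A central limitation of homomorphic encryption schemes is that they only evaluate additions and multiplications, so non-polynomial activations (sigmoid, ReLU) must be replaced by low-degree polynomial surrogates. The sketch below illustrates that substitution idea in the plain (unencrypted) domain with a least-squares fit; the degree, interval and fitting method are assumptions for illustration, not CryptoDL's specific construction.

```python
# Illustration of the polynomial-activation idea behind HE-friendly
# networks: fit a degree-3 polynomial to the sigmoid on [-4, 4].
import numpy as np

x = np.linspace(-4, 4, 400)
sigmoid = 1.0 / (1.0 + np.exp(-x))

coeffs = np.polyfit(x, sigmoid, deg=3)   # least-squares degree-3 surrogate
poly = np.poly1d(coeffs)

max_err = float(np.max(np.abs(poly(x) - sigmoid)))
print(coeffs, max_err)
```

Because the surrogate uses only + and ×, it can be evaluated on ciphertexts; the trade-off is the approximation error, which grows outside the fitted interval.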


2021 ◽  
Vol 12 ◽  
Author(s):  
Shuyun He ◽  
Duancheng Zhao ◽  
Yanle Ling ◽  
Hanxuan Cai ◽  
Yike Cai ◽  
...  

Breast cancer (BC) has surpassed lung cancer as the most frequently occurring cancer, and it is the leading cause of cancer-related death in women. Therefore, there is an urgent need to discover or design new drug candidates for BC treatment. In this study, we first collected a series of structurally diverse datasets consisting of 33,757 active and 21,152 inactive compounds for 13 breast cancer cell lines and one normal breast cell line commonly used in in vitro antiproliferative assays. Predictive models were then developed using five conventional machine learning algorithms, including naïve Bayesian, support vector machine, k-Nearest Neighbors, random forest, and extreme gradient boosting, as well as five deep learning algorithms, including deep neural networks, graph convolutional networks, graph attention network, message passing neural networks, and Attentive FP. A total of 476 single models and 112 fusion models were constructed based on three types of molecular representations including molecular descriptors, fingerprints, and graphs. The evaluation results demonstrate that the best model for each BC cell subtype can achieve high predictive accuracy for the test sets with AUC values of 0.689–0.993. Moreover, important structural fragments related to BC cell inhibition were identified and interpreted. To facilitate the use of the model, an online webserver called ChemBC (http://chembc.idruglab.cn/) and its local version software (https://github.com/idruglab/ChemBC) were developed to predict whether compounds have potential inhibitory activity against BC cells.


Author(s):  
V. Lopatenko

The memristor is a passive microelectronic element similar in its properties to a biological synapse. The possibility of using memristors as analog elements in neural networks has increased the scientific community's interest in studying their properties. In this paper, we study the possibility of modeling some characteristics of a memristor using machine learning algorithms, in particular the gradient boosting algorithm.
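The modeling task described can be sketched as a regression problem: predict device current from drive voltage and an internal state variable. Everything below is a hypothetical toy setup (a synthetic pinched-hysteresis-style curve, not measured memristor data, and an assumed state variable), used only to show gradient boosting fitting such a characteristic.

```python
# Hedged sketch: gradient boosting regression on a synthetic
# memristor-like current-voltage characteristic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 500)
v = np.sin(t)                                   # sinusoidal drive voltage (toy)
state = np.cumsum(v) / len(v)                   # toy internal state variable
i = (1.0 + 0.5 * state) * v + rng.normal(0, 0.01, len(t))  # toy current + noise

X = np.column_stack([v, state])                 # features: voltage and state
model = GradientBoostingRegressor(random_state=0).fit(X, i)
r2 = model.score(X, i)                          # in-sample fit quality
print(round(r2, 3))
```

For a real device one would fit on measured I-V sweeps and validate on held-out cycles rather than scoring in-sample.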

