RSSI Data Preparation for Machine Learning

Author(s):  
Dodo Zaenal Abidin ◽  
Siti Nurmaini ◽  
Reza Firsandava Malik ◽  
Erwin ◽  
Errissya Rasywir ◽  
...  
2021 ◽  
Author(s):  
Jerome Asedegbega ◽  
Oladayo Ayinde ◽  
Alexander Nwakanma

Abstract Several computer-aided techniques have been developed in the recent past to improve the interpretational accuracy of subsurface geology. This paradigm shift has brought considerable success in a variety of machine learning application domains and supports better feasibility studies in reservoir evaluation using multiple classification techniques. Facies classification is an essential subsurface exploration task, as sedimentary facies reflect the physical, chemical, and biological conditions that a formation unit experienced during sedimentation. This study employed formation samples for facies classification using machine learning (ML) techniques and classified different facies from well logs in seven (7) wells of the PORT Field, Offshore Niger Delta. Six wells were concatenated during data preparation and used to train supervised ML algorithms, and the models were then validated by blind testing on one well log to predict discrete facies groups. The analysis started with data preparation and examination, in which various features of the available well data were conditioned. For model building and performance assessment, support vector machine, random forest, decision tree, extra tree, neural network (multilayer perceptron), k-nearest neighbor and logistic regression models were built after dividing the data sets into training, test, and blind test well data. The Jaccard index and F1-score estimated for the blind test well were, respectively, 0.73 and 0.82 for support vector machine, 0.38 and 0.54 for random forest, 0.78 and 0.83 for extra tree, 0.91 and 0.95 for k-nearest neighbor, 0.41 and 0.56 for decision tree, 0.63 and 0.74 for logistic regression, and 0.55 and 0.68 for neural network. The ability of ML techniques to improve prediction accuracy and reduce processing time, together with their data-driven approach, makes them highly recommendable for subsurface facies classification analysis.
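
The blind-well validation and the two reported metrics can be illustrated with a minimal sketch. It assumes each sample is a dict with a hypothetical "well" key, and it assumes macro averaging over facies classes; the abstract does not state which averaging the authors used, so the numbers it produces are not comparable to the reported scores.

```python
from collections import defaultdict


def split_blind_well(rows, blind_well):
    """Hold out every sample from one well; pool the remaining wells for training."""
    train = [r for r in rows if r["well"] != blind_well]
    blind = [r for r in rows if r["well"] == blind_well]
    return train, blind


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            counts[truth]["tp"] += 1
        else:
            counts[pred]["fp"] += 1   # predicted class gains a false positive
            counts[truth]["fn"] += 1  # true class gains a false negative
    return counts


def macro_jaccard_f1(y_true, y_pred):
    """Return (macro Jaccard index, macro F1-score); macro averaging is an assumption."""
    counts = per_class_counts(y_true, y_pred)
    jaccards, f1s = [], []
    for c in counts.values():
        tp, fp, fn = c["tp"], c["fp"], c["fn"]
        jaccards.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(jaccards) / len(counts), sum(f1s) / len(counts)
```

Per class, the Jaccard index is TP / (TP + FP + FN) and the F1-score is 2TP / (2TP + FP + FN), so the F1-score is never lower than the Jaccard index for the same predictions.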


Author(s):  
Mahamah Sebakor

Is it strange that the spanning tree protocol (STP) has been the only mechanism used to defend the Layer-2 backbone against loops? Do we trust it? For several decades, the campus backbone has often been an unsuspected source of problems, one of which is STP failure. Meanwhile, MAC address flapping is a plausible issue for modern network fabrics. Given these serious Layer-2 issues, particularly the extended STP designs of legacy switches, this work uses a software-defined networking approach to evaluate traditional and modern networks. Through MAC address lookups on all bridge devices, this work proposes the Layer-2 evaluation system (LES), which uses a novel approach known as support supervised learning to prepare data for machine learning. Additionally, the LES enables network administrators to assess their backbones. This study is intended to evaluate potential network slowdowns caused by MAC address problems. Furthermore, this work investigates the proposed method in a real network and covers its evaluation and performance.
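
As a rough illustration of how MAC address lookups could be turned into a feature for learning, the sketch below counts how often a MAC address changes port on the same switch. This is not the LES method, which the abstract does not detail; the observation tuple layout, the function names and the `min_moves` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def count_port_moves(observations):
    """Count port changes per (switch, mac).

    observations: iterable of (timestamp, switch, mac, port) tuples in any
    order (hypothetical layout). A "move" is a change of port between two
    consecutive sightings of the same MAC on the same switch.
    """
    moves = defaultdict(int)
    last_port = {}
    for _ts, switch, mac, port in sorted(observations):  # sort by timestamp first
        key = (switch, mac)
        if key in last_port and last_port[key] != port:
            moves[key] += 1
        last_port[key] = port
    return dict(moves)


def flapping_macs(observations, min_moves=3):
    """Return the (switch, mac) pairs with at least min_moves port changes."""
    return {key for key, n in count_port_moves(observations).items() if n >= min_moves}
```

The per-address move count could serve as one input column when labelling healthy and problematic backbone segments; the threshold would need tuning against real MAC tables.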


2021 ◽  
Author(s):  
Anton Goretsky ◽  
Anastasia Dmitrienko ◽  
Irene Tang ◽  
Nicolae Lari ◽  
Owen Kunhardt ◽  
...  

In 2010, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) started the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be (nuMoM2b), a prospective cohort study of a racially, ethnically and geographically diverse population of nulliparous women with singleton gestation. The nuMoM2b dataset is very large, consisting of data for 10,038 patients with over 4,600 features per patient, spread over 80 files. In this report, we share our experience preparing and working with this dataset. We present our preprocessing of the nuMoM2b dataset, which aims to give a deeper understanding of the data, extract the most relevant features, make the fewest assumptions when filling in unknown values, and reduce the dimensionality of the data. We hope this report is useful to researchers interested in building machine learning and statistical models from the nuMoM2b dataset.
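
One way to reduce dimensionality while making few assumptions about unknown values is to drop sparsely populated features and to record missingness explicitly instead of imputing. The sketch below shows that idea only; it is not the report's actual pipeline, and the record layout, the feature names and the 0.5 threshold are hypothetical.

```python
def missing_fraction(records, feature):
    """Fraction of records in which the feature is absent or None."""
    missing = sum(1 for r in records if r.get(feature) is None)
    return missing / len(records)


def select_features(records, features, max_missing=0.5):
    """Keep only features whose missing fraction is at most max_missing."""
    return [f for f in features if missing_fraction(records, f) <= max_missing]


def add_missing_indicators(records, features):
    """Copy the chosen features and add a boolean '<feature>_missing' column.

    Unknown values stay as None, so no value is invented; a downstream
    model can use the indicator column instead.
    """
    out = []
    for r in records:
        row = {}
        for f in features:
            row[f] = r.get(f)
            row[f + "_missing"] = r.get(f) is None
        out.append(row)
    return out
```

Keeping an explicit indicator defers the imputation decision to the modelling stage, at the cost of doubling the number of retained columns.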


2021 ◽  
pp. 105-116
Author(s):  
A. M. KOZIN ◽  
A. D. LYKOV ◽  
I. A. VYAZANKIN ◽  
A. S. VYAZANKIN ◽  
...  

The “Middle Atmosphere” Regional Information and Analytic Center (Central Aerological Observatory) is developing algorithms for analyzing the quality of aerological data based on machine learning methods. Different approaches to data preparation are described, examples of data rejected by standard approaches are given, and ways to improve the quality of aerological information transmitted to the WMO international network are outlined.


Author(s):  
Stamatios-Aggelos N. Alexandropoulos ◽  
Sotiris B. Kotsiantis ◽  
Michael N. Vrahatis

Abstract A large variety of issues influence the success of data mining on a given problem. Two primary issues are the representation and the quality of the dataset. Specifically, if much redundant, unrelated, noisy or unreliable information is present, knowledge discovery becomes a very difficult problem. It is well known that data preparation steps require significant processing time in machine learning tasks. It would be very helpful if there were preprocessing algorithms with the same reliable and effective performance across all datasets, but this is impossible. To this end, we present the most well-known and widely used up-to-date algorithms for each step of data preprocessing in the framework of predictive data mining.


Author(s):  
Y. Lathasree ◽  
G. Mamatha

This paper proposes the preparation of quality data for training an accurate machine learning model. Data preparation is very important in machine learning. Here we prepare the data for air pollution forecasting. Air pollution forecasting has traditionally been done with physical models of the atmosphere, which are unstable and inaccurate over long periods of time. Since machine learning techniques are more robust to perturbations, in this paper we explore data preparation and applications of machine learning to air pollution forecasting to potentially generate more accurate predictions. A linear regression model is trained on the prepared data to predict air pollution more accurately.
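
A minimal sketch of the idea, assuming a single numeric predictor and a single pollution reading per sample: incomplete samples are dropped, then a line is fitted by ordinary least squares. The variable names and the one-feature setup are assumptions for illustration; the abstract does not describe the paper's actual features or cleaning steps.

```python
def drop_incomplete(pairs):
    """Data preparation step: discard samples with a missing predictor or target."""
    return [(x, y) for x, y in pairs if x is not None and y is not None]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept
```

Typical use is `model = fit_line(drop_incomplete(samples))` followed by `predict(model, new_x)`. A real forecast would likely need several predictors, which calls for multiple linear regression rather than this one-variable form.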


AI & Society ◽  
2021 ◽  
Author(s):  
Jan Kaiser ◽  
German Terrazas ◽  
Duncan McFarlane ◽  
Lavindra de Silva

Abstract Machine learning (ML) is increasingly used to enhance production systems and meet the requirements of a rapidly evolving manufacturing environment. Compared to larger companies, however, small- and medium-sized enterprises (SMEs) lack resources, available data and skills, which impedes the potential adoption of analytics solutions. This paper proposes a preliminary yet general approach to identify low-cost analytics solutions for manufacturing SMEs, with particular emphasis on ML. The initial studies seem to suggest that, contrary to what is usually thought at first glance, SMEs seldom need digital solutions that use advanced ML algorithms which require extensive data preparation, laborious parameter tuning and a comprehensive understanding of the underlying problem. If an analytics solution does require learning capabilities, a ‘simple solution’, which we will characterise in this paper, should be sufficient.


2020 ◽  
Author(s):  
Manuel Muñoz-Aguirre ◽  
Vasilis F. Ntasis ◽  
Roderic Guigó

Abstract The development of increasingly sophisticated methods to acquire high-resolution images has led to the generation of large collections of biomedical imaging data, including images of tissues and organs. Many of the current machine learning methods that aim to extract biological knowledge from histopathological images require several data preprocessing stages, creating an overhead before the actual analysis. Here we present PyHIST (https://github.com/manuel-munoz-aguirre/PyHIST), an easy-to-use, open-source tool for tissue segmentation and preprocessing of whole slide histological images, aimed at data preparation for machine learning applications.

