Prediction of Wind Speed Using Real Data: An Analysis of Statistical Machine Learning Techniques

This chapter introduces the common methods and practices of statistical machine learning. It covers the development and application of algorithms and the ways in which they learn from observed data by building models, which can in turn be used for prediction. Although machine learning and statistics are often assumed to be only loosely related, the two go hand in hand. We show how statistical methods such as linear regression and classification are used in machine learning, and we look at how classification and regression techniques are implemented. Although standard machine learning libraries implement a large number of algorithms, we examine how to tune those algorithms and, drawing on statistical methods, which parameters and features affect their performance.
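To make the link between statistics and machine learning concrete, here is a minimal sketch of ordinary least squares for a single predictor, written in plain Python. It is not taken from the chapter; the function names `fit_line` and `predict` are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept
```

The "model" here is just the pair of fitted coefficients: the statistical estimate and the learned predictor are the same object.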

Development of a Simulation Prediction System Using Statistical Machine Learning Techniques

KIPS Transactions on Software and Data Engineering ◽

10.3745/ktsde.2016.5.11.593 ◽

2016 ◽

Vol 5 (11) ◽

pp. 593-606

Author(s):

Ki Yong Lee ◽

YoonJae Shin ◽

YeonJeong Choe ◽

SeonJeong Kim ◽

Young-Kyoon Suh ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Prediction System ◽

Statistical Machine Learning ◽

Learning Techniques

Analysis of Intrusion Detection and Classification using Machine Learning Approaches

SMART MOVES JOURNAL IJOSCIENCE ◽

10.24113/ijoscience.v3i10.13 ◽

2017 ◽

Vol 3 (10) ◽

Author(s):

Anjum Khan ◽

Anjana Nigam

Keyword(s):

Machine Learning ◽

Network Security ◽

Intrusion Detection ◽

Detection System ◽

Real Data ◽

Machine Learning Techniques ◽

Learning Approaches ◽

High Detection Rate ◽

Learning Techniques ◽

Result Analysis

As network-based applications grow quickly, network security mechanisms need more attention to improve speed and precision. Ever-evolving new intrusion types pose a significant threat to network security. Although various network security tools have been developed, the rapid growth of intrusive activity remains a significant issue. Intrusion detection systems (IDSs) are used to detect intrusive activities on the network. Analysis has shown that applying machine learning techniques to intrusion detection can achieve a high detection rate. Machine learning and classification algorithms help to design “Intrusion Detection Models” that classify network traffic as intrusive or normal. This paper discusses some commonly used machine learning techniques in intrusion detection systems and also reviews a number of existing machine learning IDSs proposed by researchers at different times. An experimental analysis is performed to demonstrate the performance of some existing techniques so that they can be used further in developing a hybrid classifier for real data packet classification. The result analysis shows that KNN, RF and SVM perform best on the NSL-KDD dataset.
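As an illustration of how a classifier separates intrusive from normal traffic, the following is a minimal k-nearest-neighbours sketch in plain Python. It is not the paper's implementation: the two-dimensional feature vectors in `TOY_TRAIN` are invented for the example and are not drawn from NSL-KDD, and the name `knn_predict` is hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical, hand-made samples: (scaled feature vector, label).
TOY_TRAIN = [
    ((0.10, 0.20), "normal"),
    ((0.20, 0.10), "normal"),
    ((0.15, 0.25), "normal"),
    ((0.90, 0.80), "intrusive"),
    ((0.80, 0.90), "intrusive"),
    ((0.85, 0.95), "intrusive"),
]


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of samples")
    # Sort training points by Euclidean distance to the query, keep the k closest.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, `knn_predict(TOY_TRAIN, (0.12, 0.18))` votes among the three closest points, all of which carry the label "normal" in this toy set.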

Deep Learning for the derivation of GNSS Reflectometry global ocean wind speed

10.5194/egusphere-egu21-4665 ◽

2021 ◽

Author(s):

Milad Asgarimehr ◽

Caroline Arnold ◽

Felix Stiehler ◽

Tobias Weigel ◽

Chris Ruf ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Deep Learning ◽

Wind Speed ◽

Sampling Rate ◽

Machine Learning Techniques ◽

Global Ocean ◽

Deep Model ◽

Learning Techniques ◽

Unseen Data

Global Navigation Satellite System Reflectometry (GNSS-R) is a novel remote sensing technique that exploits GNSS signals after reflection off the Earth's surface. The capability of spaceborne GNSS-R to monitor ocean state and surface wind has recently been well demonstrated; the technique offers an unprecedented sampling rate and considerable robustness during rainfall. Cyclone GNSS (CyGNSS), launched in December 2016, is the first spaceborne mission fully dedicated to GNSS-R. Thanks to the low development costs of GNSS-R satellite missions and their capability of tracking multiple reflected signals from numerous GNSS transmitters, GNSS-R datasets are much larger than those from conventional remote sensing techniques. CyGNSS provides a high number of unique samples, on the order of a few million per month. Deep learning can therefore be implemented in GNSS-R even more efficiently than in other remote sensing domains. With the upcoming GNSS-R CubeSats, the data volume is expected to increase in the near future, and GNSS-R “Big data” can be a future challenge. Deep learning methods are additionally able to correct potential effects, both technical and geophysical, that appear empirically in the data when the mechanisms are not well described by theoretical knowledge. This poses the question of whether GNSS-R should embrace deep learning and can benefit from this modern data-science method like other Earth Observation domains. The receivers onboard CyGNSS cross-correlate the reflected signals received at a nadir antenna with a locally generated replica. The cross-correlation power over a range of signal delays and Doppler frequency shifts is the observational output of the receivers, called delay-Doppler maps (DDMs). The mapped power is inversely proportional to the ocean roughness and consequently to surface winds. A few recent studies show some merits of machine learning techniques for deriving ocean winds from DDMs. However, the capability of machine learning techniques, especially deep learning, for operational data derivation needs to be better characterized. Normally, operational retrieval algorithms are developed on an existing dataset and are expected to operate on upcoming measurements. Machine learning-based models are therefore expected to generalize well to unseen data from future periods. Herein, we aim to characterize deep learning capabilities for these GNSS-R operational purposes. In this interdisciplinary study, we present a deep learning algorithm that processes CyGNSS measurements to derive wind speed data. The model is expected to reach an acceptable level of generalization on upcoming unseen data and can alternatively be used as an operational processing algorithm. We propose a deep model based on convolutional and fully connected layers that processes the DDMs together with ancillary input features. The model yields the best quality of global wind speed estimates from GNSS-R measurements so far, with an overall root mean square error of 1.3 m/s over unseen data from a time span different from that of the training data.
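The evaluation idea described above, scoring a model by root mean square error on data from a later time span than the training data, can be sketched in plain Python as follows. This is only an illustration: the helper names `split_by_time` and `rmse`, and the `(timestamp, wind_speed)` sample layout, are assumptions and not the authors' code.

```python
from math import sqrt


def split_by_time(samples, cutoff):
    """Split (timestamp, value) samples into those before and from `cutoff` on.

    Training on the earlier part and testing on the later part mimics an
    operational setting where the model only ever sees past data.
    """
    train = [s for s in samples if s[0] < cutoff]
    test = [s for s in samples if s[0] >= cutoff]
    return train, test


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(squared / len(predicted))
```

A random split would let samples from the same period appear in both sets, which can overstate how well the model generalizes to future measurements; the temporal split avoids that.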

Determining the parameters of high amplification microlensing events by means of statistical machine learning techniques

Proceedings of the International Astronomical Union ◽

10.1017/s1743921316012977 ◽

2016 ◽

Vol 12 (S325) ◽

pp. 213-216

Author(s):

Elena Fedorova

Keyword(s):

Machine Learning ◽

Dark Matter ◽

Machine Learning Techniques ◽

Statistical Machine Learning ◽

Density Profiles ◽

Important Clue ◽

Gravitational Microlensing ◽

Learning Techniques

Strong gravitational microlensing (GM) events make it possible to determine the parameters of both the microlensed source and the microlens. GM can be an important clue to understanding the nature of dark matter on comparatively small spatial and mass scales (i.e. substructure), especially when astrometric and photometric data on high amplification microlensing events (HAME) are combined. At the same time, fitting the HAME lightcurves of microlensed sources is a quite time-consuming process. We therefore test the possibility of applying statistical machine learning techniques to determine the source and microlens parameters for a set of HAME lightcurves, using a simulated set of amplification curves of sources microlensed by point masses and by clumps of DM with various density profiles.
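For orientation, the standard amplification of a point source by a point-mass lens is A(u) = (u² + 2) / (u √(u² + 4)), where u is the lens-source separation in units of the Einstein radius. The sketch below generates such a curve in plain Python. It is a textbook simplification, not the simulation used in the paper, which also covers extended dark matter clumps; the parameter names `t0`, `t_e` and `u0` follow common usage but are assumptions here.

```python
from math import sqrt


def amplification(u):
    """Point-source, point-lens amplification for separation u > 0."""
    if u <= 0:
        raise ValueError("separation u must be positive")
    return (u * u + 2) / (u * sqrt(u * u + 4))


def light_curve(times, t0, t_e, u0):
    """Amplification at each time for a lens passing at impact parameter u0.

    t0 is the time of closest approach and t_e the Einstein-radius
    crossing time; the separation is u(t) = sqrt(u0^2 + ((t - t0) / t_e)^2).
    """
    return [
        amplification(sqrt(u0 ** 2 + ((t - t0) / t_e) ** 2))
        for t in times
    ]
```

High amplification corresponds to a small impact parameter: with `u0 = 0.1` the peak amplification is about 10, while at `u = 1` it is only about 1.34.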

Don’t Dismiss Logistic Regression: The Case for Sensible Extraction of Interactions in the Era of Machine Learning

10.1101/2019.12.15.877134 ◽

2019 ◽

Cited By ~ 1

Author(s):

Joshua J. Levy ◽

A. James O’Malley

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Model Building ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Statistical Machine Learning ◽

Forest Model ◽

Learning Techniques ◽

Modeling Techniques

Background: Machine learning approaches have become increasingly popular modeling techniques, relying on data-driven heuristics to arrive at their solutions. Recent comparisons between these algorithms and traditional statistical modeling techniques have largely ignored the superiority gained by the former approaches due to the involvement of model-building search algorithms. This has led to the alignment of statistical and machine learning approaches with different types of problems and the under-development of procedures that combine their attributes. In this context, we hoped to understand the domains of applicability of each approach and to identify areas where a marriage between the two approaches is warranted. We then sought to develop a hybrid statistical-machine learning procedure with the best attributes of each. Methods: We present three simple examples to illustrate when to use each modeling approach and posit a general framework for combining them into an enhanced logistic regression model-building procedure that aids interpretation. We study 556 benchmark machine learning datasets to uncover when machine learning techniques outperformed rudimentary logistic regression models and so are potentially well equipped to enhance them. We illustrate a software package, InteractionTransformer, which embeds logistic regression with advanced model-building capacity by using machine learning algorithms to extract candidate interaction features from a random forest model for inclusion in the model. Finally, we apply our enhanced logistic regression analysis to two real-world biomedical examples, one where predictors vary linearly with the outcome and another with extensive second-order interactions. Results: Preliminary statistical analysis demonstrated that across 556 benchmark datasets, the random forest approach significantly outperformed the logistic regression approach. We found a statistically significant increase in predictive performance when using hybrid procedures, and greater clarity in the association of the acquired terms with the outcome compared to directly interpreting the random forest output. Conclusions: When a random forest model is closer to the true model, hybrid statistical-machine learning procedures can substantially enhance the performance of statistical procedures in an automated manner while preserving easy interpretation of the results. Such hybrid methods may help facilitate widespread adoption of machine learning techniques in the biomedical setting.
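To show what adding an interaction feature to a logistic regression means, a generic two-predictor form is given below. The notation is a standard textbook one and is an assumption here, not the paper's own; the abstract does not state how candidate interactions are scored or selected.

```latex
\[
\log\frac{p}{1-p} \;=\; \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}\, x_1 x_2,
\qquad p = \Pr(y = 1 \mid x_1, x_2)
\]
```

With \(\beta_{12} = 0\) the model is purely additive on the log-odds scale. A nonzero \(\beta_{12}\) lets the effect of \(x_1\) depend on \(x_2\), yet the added term remains a single coefficient that can be read directly, which is the interpretability advantage the abstract attributes to the hybrid procedure.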

Prediction of defects using machine learning techniques in order to improve quality management system – A case study

MATEC Web of Conferences ◽

10.1051/matecconf/202134305010 ◽

2021 ◽

Vol 343 ◽

pp. 05010

Author(s):

Adina Sârb ◽

Cristina Burja Udrea ◽

Daniela Nagy – Oniţa ◽

Liliana Itul ◽

Maria Popa

Keyword(s):

Machine Learning ◽

Quality Management ◽

Management System ◽

Quality Management System ◽

Real Data ◽

Machine Learning Techniques ◽

Data Sets ◽

Learning Techniques ◽

Different Types ◽

The Future

According to ISO 9000, a quality management system is part of a set of related or interacting elements of an organization that sets policies and objectives, as well as the processes necessary to achieve the quality objectives. Quality is the extent to which a set of intrinsic characteristics of an object meets the requirements. Based on these definitions, the factory considered in this paper, S.C. APULUM S.A., has implemented a quality management system since 1998. Since then, the organization's attention has focused on the continuous improvement of that system. The purpose of this paper is to forecast the future percentage of defects specific to ceramic products in order to improve the quality management system. To this end, machine learning techniques were applied to defect forecasting for different types of products: mugs, pressed plates and jiggered plates. The experimental evaluation was performed on real data sets containing the percentages of different types of defects collected in 2018-2019. The experimental results show that for each type of product there is an algorithm that forecasts the future defects.
