Machine learning phases and criticalities without using real data for training

2020 ◽  
Vol 102 (22) ◽  
Author(s):  
D.-R. Tan ◽  
F.-J. Jiang

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4618
Author(s):  
Francisco Oliveira ◽  
Miguel Luís ◽  
Susana Sargento

Unmanned Aerial Vehicle (UAV) networks are an emerging technology, useful not only for the military, but also for public and civil purposes. Their versatility provides advantages in situations where an existing network cannot support all requirements of its users, either because of an exceptionally large number of users or because of the failure of one or more ground base stations. Networks of UAVs can reinforce these cellular networks where needed, redirecting the traffic to available ground stations. Using machine learning algorithms to predict overloaded traffic areas, we propose a UAV positioning algorithm responsible for determining suitable positions for the UAVs, with the objective of a more balanced redistribution of traffic, to avoid saturated base stations and decrease the number of users without a connection. The tests performed with real data of user connections through base stations show that, in less restrictive network conditions, the algorithm that dynamically places the UAVs performs significantly better than in more restrictive conditions, significantly reducing the number of users without a connection. We also conclude that the accuracy of the prediction is a very important factor, not only in the reduction of users without a connection, but also in the number of UAVs deployed.
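To make the idea concrete, here is a minimal sketch (not the authors' algorithm) of combining a per-cell overload prediction with greedy UAV placement on the most saturated cells; the cell grid, capacities, and the greedy rule are all illustrative assumptions.

```python
# Hypothetical sketch: given predicted per-cell traffic and base-station
# capacities, place UAVs greedily on the most overloaded cells so that
# excess users can be offloaded. Names and the placement rule are assumptions.
import numpy as np

def place_uavs(predicted_load, capacity, n_uavs, uav_capacity):
    """Return the indices of cells chosen for UAV deployment."""
    overload = np.maximum(predicted_load - capacity, 0.0)  # users above capacity
    positions = []
    for _ in range(n_uavs):
        cell = int(np.argmax(overload))
        if overload[cell] <= 0:            # no saturated cells left
            break
        positions.append(cell)
        overload[cell] = max(overload[cell] - uav_capacity, 0.0)
    return positions

# toy usage: 5 cells with predicted loads vs. base-station capacities
print(place_uavs(np.array([120.0, 80.0, 200.0, 60.0, 150.0]),
                 capacity=np.full(5, 100.0), n_uavs=2, uav_capacity=60.0))
```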


1993 ◽  
Vol 18 (2-4) ◽  
pp. 209-220
Author(s):  
Michael Hadjimichael ◽  
Anita Wasilewska

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.
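As a hedged illustration of the rough-set notions the paper builds on (not the authors' inductive learning algorithm), the toy snippet below computes equivalence classes over condition attributes and the lower and upper approximations of a decision class, from which certain and possible rules would be induced; the records and attributes are made up.

```python
# Toy rough-set illustration with hypothetical voter-style records.
from collections import defaultdict

# (condition attributes, decision)
data = [({"age": "young", "income": "low"},  "A"),
        ({"age": "young", "income": "low"},  "B"),
        ({"age": "old",   "income": "high"}, "A"),
        ({"age": "old",   "income": "low"},  "B")]

# group objects that are indiscernible on the condition attributes
classes = defaultdict(list)
for i, (cond, _) in enumerate(data):
    classes[tuple(sorted(cond.items()))].append(i)

target = {i for i, (_, dec) in enumerate(data) if dec == "A"}
lower = set().union(*(set(c) for c in classes.values() if set(c) <= target))
upper = set().union(*(set(c) for c in classes.values() if set(c) & target))

print("lower approximation of class A:", lower)  # certain members -> certain rules
print("upper approximation of class A:", upper)  # possible members -> possible rules
```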


Mathematics ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 936
Author(s):  
Jianli Shao ◽  
Xin Liu ◽  
Wenqing He

Imbalanced data exist in many classification problems. The classification of imbalanced data poses remarkable challenges in machine learning. Among different classifiers, the support vector machine (SVM) and its variants are popular thanks to their flexibility and interpretability. However, the performance of SVMs is impacted when the data are imbalanced, which is a typical data structure in the multi-category classification problem. In this paper, we employ a data-adaptive SVM with scaled kernel functions to classify instances for a multi-class population. We propose a multi-class data-dependent kernel function for the SVM that accounts for class imbalance and the spatial association among instances, so that the classification accuracy is enhanced. Simulation studies demonstrate the superb performance of the proposed method, and a real multi-class prostate cancer image dataset is employed as an illustration. Not only does the proposed method outperform the competitor methods in terms of commonly used accuracy measures such as the F-score and G-means, but it also successfully detects more than 60% of the instances from the rare class in the real data, while the competitors detect fewer than 20% of the rare class instances. The proposed method will benefit other scientific research fields, such as multiple region boundary detection.
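For readers who want a starting point, the snippet below is a hedged sketch of training an SVM on imbalanced multi-class data; it uses scikit-learn's built-in class weighting and a standard RBF kernel as a stand-in for the paper's data-dependent scaled kernel, and reports the macro F-score, G-mean, and rare-class recall on synthetic data.

```python
# Hedged sketch: class-weighted SVM on a synthetic imbalanced 3-class problem.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import f1_score, recall_score

X, y = make_classification(n_samples=1500, n_classes=3, n_informative=6,
                           weights=[0.8, 0.15, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = SVC(kernel="rbf", class_weight="balanced", gamma="scale").fit(X_tr, y_tr)
pred = clf.predict(X_te)

recalls = recall_score(y_te, pred, average=None)          # per-class recall
print("macro F-score:", f1_score(y_te, pred, average="macro"))
print("G-mean:", float(np.prod(recalls) ** (1.0 / len(recalls))))
print("rare-class recall:", recalls[-1])
```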


2021 ◽  
pp. 190-200
Author(s):  
Lesia Mochurad ◽  
Yaroslav Hladun

The paper considers a method for analyzing a person's psychophysical state based on a psychomotor indicator: the finger tapping test. A mobile phone app that generalizes the classic tapping test was developed for the experiments. The developed tool allows collecting samples and analyzing them both as individual experiments and as a dataset as a whole. The data are investigated for anomalies using statistical methods and hyperparameter optimization, and an algorithm for reducing their number is developed. A machine learning model is used to predict different features of the dataset. These experiments demonstrate the structure of the data obtained with the finger tapping test. As a result, we gained knowledge of how to conduct experiments so that the model generalizes better in the future. A method for removing anomalies is developed and can be used in further research to increase the accuracy of the model. The developed model is a multilayer recurrent neural network that works well for time series classification. The model's learning error is 1.5% on a synthetic dataset and 5% on real data from a similar distribution.
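A minimal sketch of a multilayer recurrent classifier for tapping-test time series is given below; the sequence length, layer sizes, number of psychophysical classes, and the random stand-in data are all assumptions, not the architecture or data from the paper.

```python
# Hedged sketch: stacked LSTM classifier for fixed-length tapping sequences.
import numpy as np
import tensorflow as tf

seq_len, n_features, n_classes = 50, 1, 3   # e.g. 50 inter-tap intervals, 3 states
model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len, n_features)),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# synthetic stand-in data, mirroring the paper's use of a synthetic set first
X = np.random.rand(200, seq_len, n_features).astype("float32")
y = np.random.randint(0, n_classes, size=200)
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```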


2020 ◽  
Vol 30 (11n12) ◽  
pp. 1759-1777
Author(s):  
Jialing Liang ◽  
Peiquan Jin ◽  
Lin Mu ◽  
Jie Zhao

With the development of Web 2.0, social media such as Twitter and Sina Weibo have become an essential platform for disseminating hot events. Simultaneously, due to the free policy of microblogging services, users can post user-generated content freely on microblogging platforms. Accordingly, more and more hot events on microblogging platforms have been labeled as spammers. Spammers will not only hurt the healthy development of social media but also introduce many economic and social problems. Therefore, the government and enterprises must distinguish whether a hot event on microblogging platforms is a spammer or is a naturally-developing event. In this paper, we focus on the hot event list on Sina Weibo and collect the relevant microblogs of each hot event to study the detecting methods of spammers. Notably, we develop an integral feature set consisting of user profile, user behavior, and user relationships to reflect various factors affecting the detection of spammers. Then, we employ typical machine learning methods to conduct extensive experiments on detecting spammers. We use a real data set crawled from the most prominent Chinese microblogging platform, Sina Weibo, and evaluate the performance of 10 machine learning models with five sampling methods. The results in terms of various metrics show that the Random Forest model and the over-sampling method achieve the best accuracy in detecting spammers and non-spammers.
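As a hedged sketch of the best-performing combination reported above (Random Forest plus over-sampling), the snippet below uses imbalanced-learn's RandomOverSampler and scikit-learn's RandomForestClassifier on a synthetic imbalanced dataset; the paper's profile, behavior, and relationship features are not reproduced here.

```python
# Hedged sketch: over-sampling followed by a Random Forest, on synthetic features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from imblearn.over_sampling import RandomOverSampler

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

X_res, y_res = RandomOverSampler(random_state=0).fit_resample(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)
print(classification_report(y_te, clf.predict(X_te),
                            target_names=["event", "spammer"]))
```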


2019 ◽  
Vol 2019 (1) ◽  
pp. 26-46 ◽  
Author(s):  
Thee Chanyaswad ◽  
Changchang Liu ◽  
Prateek Mittal

A key challenge facing the design of differential privacy in the non-interactive setting is to maintain the utility of the released data. To overcome this challenge, we utilize the Diaconis-Freedman-Meckes (DFM) effect, which states that most projections of high-dimensional data are nearly Gaussian. Hence, we propose the RON-Gauss model that leverages the novel combination of dimensionality reduction via random orthonormal (RON) projection and the Gaussian generative model for synthesizing differentially-private data. We analyze how RON-Gauss benefits from the DFM effect, and present multiple algorithms for a range of machine learning applications, including both unsupervised and supervised learning. Furthermore, we rigorously prove that (a) our algorithms satisfy the strong ɛ-differential privacy guarantee, and (b) RON projection can lower the level of perturbation required for differential privacy. Finally, we illustrate the effectiveness of RON-Gauss under three common machine learning applications – clustering, classification, and regression – on three large real-world datasets. Our empirical results show that (a) RON-Gauss outperforms previous approaches by up to an order of magnitude, and (b) loss in utility compared to the non-private real data is small. Thus, RON-Gauss can serve as a key enabler for real-world deployment of privacy-preserving data release.
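The snippet below is a simplified, illustrative sketch of the RON-Gauss pipeline described above: project the data onto a random orthonormal (RON) basis, fit a Gaussian generative model with noise-perturbed statistics, and sample synthetic records. The noise scales are placeholders and do not reproduce the paper's sensitivity analysis or its exact algorithms.

```python
# Simplified, non-faithful sketch of the RON-Gauss idea (placeholder noise).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))            # stand-in for the real dataset
d, eps = 5, 1.0                            # target dimension, privacy budget

# RON projection: orthonormal columns from the QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.normal(size=(X.shape[1], d)))
Z = X @ Q                                  # nearly Gaussian by the DFM effect

# Gaussian generative model with perturbed sufficient statistics (placeholder noise)
mu = Z.mean(axis=0) + rng.laplace(scale=1.0 / (len(Z) * eps), size=d)
cov = np.cov(Z, rowvar=False) + rng.laplace(scale=1.0 / (len(Z) * eps), size=(d, d))
cov = (cov + cov.T) / 2                    # re-symmetrize after adding noise

synthetic = rng.multivariate_normal(mu, cov, size=len(Z))
print(synthetic.shape)
```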


Author(s):  
A. Hanel ◽  
H. Klöden ◽  
L. Hoegner ◽  
U. Stilla

Today, cameras mounted in vehicles are used to observe the driver as well as the objects around a vehicle. In this article, an outline of a concept for image-based recognition of dynamic traffic situations is shown. A dynamic traffic situation will be described by road users and their intentions. Images will be taken by a vehicle fleet and aggregated on a server. On these images, new strategies for machine learning will be applied iteratively when new data have arrived on the server. The results of the learning process will be models describing the traffic situation, which will be transmitted back to the recording vehicles. The recognition will be performed as a standalone function in the vehicles and will use the received models. It can be expected that this method can make the detection and classification of objects around the vehicles more reliable. In addition, the prediction of their actions for the next seconds should be possible. As one example of how this concept is used, a method to recognize the illumination situation of a traffic scene is described. This allows handling different appearances of objects depending on the illumination of the scene. Different illumination classes will be defined to distinguish different illumination situations. Intensity-based features are extracted from the images and used by a classifier to assign an image to an illumination class. This method is tested on a real data set of daytime and nighttime images. It can be shown that the illumination class is classified correctly for more than 80% of the images.
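As a hedged sketch of the final step described above, the snippet below extracts simple intensity-based features (a grayscale histogram plus mean and standard deviation) and trains a classifier to assign images to an illumination class; the feature choice, the two classes, and the synthetic bright/dark images are illustrative assumptions.

```python
# Hedged sketch: intensity features -> illumination class, on synthetic images.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def intensity_features(img, bins=16):
    """Normalized grayscale histogram plus mean/std intensity."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255), density=True)
    return np.concatenate([hist, [img.mean() / 255.0, img.std() / 255.0]])

# synthetic stand-ins for daytime (bright) and nighttime (dark) images
rng = np.random.default_rng(0)
day = [rng.normal(180, 30, (64, 64)).clip(0, 255) for _ in range(50)]
night = [rng.normal(50, 20, (64, 64)).clip(0, 255) for _ in range(50)]
X = np.array([intensity_features(im) for im in day + night])
y = np.array([1] * 50 + [0] * 50)          # 1 = daytime, 0 = nighttime

clf = RandomForestClassifier(random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```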


The Internet of Things (IoT) is one of the fast-growing technology paradigms used in every sector, wherein Quality of Service (QoS) is a critical component of such systems and of the usage perspective with respect to ProSumers (producers and consumers). Most of the recent research works on QoS in IoT have used Machine Learning (ML) techniques as one of the computing methods for improved performance and solutions. The adoption of Machine Learning and its methodologies has become a common trend and need in every technology and domain area, through open source frameworks, task-specific algorithms, and AI and ML techniques. In this work, we propose an ML-based prediction model for resource optimization in the IoT environment for QoS provisioning. The proposed methodology is implemented using a multi-layer neural network (MNN) for Long Short-Term Memory (LSTM) learning in a layered IoT environment. Here, the model considers resources such as bandwidth and energy as QoS parameters and provides the required QoS through efficient utilization of the resources in the IoT environment. The performance of the proposed model is evaluated in a real field implementation of a civil construction project, wherein the real data are collected using video sensors and mobile devices as edge nodes. It is observed that the prediction model improves bandwidth and energy utilization, in turn providing the required QoS in the IoT environment.
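A minimal sketch of the LSTM-based prediction idea is shown below: forecast the next step of two QoS signals (bandwidth and energy) from a sliding window of past measurements; the window length, network size, and random stand-in data are assumptions, not the paper's MNN/LSTM configuration.

```python
# Hedged sketch: sliding-window LSTM forecasting of two QoS signals.
import numpy as np
import tensorflow as tf

window, n_qos = 10, 2                       # 2 QoS signals: bandwidth, energy
series = np.random.rand(500, n_qos).astype("float32")   # stand-in measurements

# build (window -> next value) training pairs
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, n_qos)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(n_qos),           # predicted bandwidth and energy
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print(model.predict(X[:1], verbose=0))      # next-step QoS estimate
```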


2021 ◽  
Author(s):  
Mikhail Kanevski

Nowadays a wide range of methods and tools to study and forecast time series is available. An important problem in forecasting concerns the embedding of time series, i.e. the construction of a high-dimensional space in which the forecasting problem is considered as a regression task. There are several basic linear and nonlinear approaches to constructing such a space by defining an optimal delay vector using different theoretical concepts. Another way is to consider this space as an input feature space (IFS) and to apply machine learning feature selection (FS) algorithms to optimize the IFS according to the problem under study (analysis, modelling or forecasting). Such an approach is an empirical one: it is based on data and depends on the FS algorithms applied. In machine learning, features are generally classified as relevant, redundant and irrelevant. This gives a rich possibility to perform advanced multivariate time series exploration and to develop interpretable predictive models.

Therefore, in the present research, different FS algorithms are used to analyze fundamental properties of time series from an empirical point of view. Linear and nonlinear simulated time series are studied in detail to understand the advantages and drawbacks of the proposed approach. Real data case studies deal with air pollution and wind speed time series. Preliminary results are quite promising and more research is in progress.
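To make the IFS idea concrete, the snippet below is a hedged sketch: delayed copies of a simulated series form the input feature space, and a feature-selection score (mutual information here, as one possible choice) ranks the delays by relevance; the lag depth and the simulated signal are illustrative.

```python
# Hedged sketch: delay embedding as an input feature space, ranked by relevance.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
t = np.arange(2000)
x = np.sin(0.2 * t) + 0.1 * rng.normal(size=t.size)   # simulated series

max_lag = 10
# embedding matrix: column j holds x(t - j - 1), target is x(t)
X = np.column_stack([x[max_lag - j - 1:-j - 1] for j in range(max_lag)])
y = x[max_lag:]

scores = mutual_info_regression(X, y, random_state=0)
for lag, s in enumerate(scores, start=1):
    print(f"lag {lag}: relevance {s:.3f}")   # relevant vs. irrelevant delays
```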


Author(s):  
Sachin Kumar ◽  
Karan Veer

Aims: The objective of this research is to predict the COVID-19 cases in India based on machine learning approaches. Background: COVID-19, a respiratory disease caused by one of the coronavirus family members, led to a pandemic situation worldwide in 2020. The virus was first detected in Wuhan, China, in December 2019, and the disease took less than three months to spread across the globe. Objective: In this paper, we propose a regression model based on the Support Vector Machine (SVM) to forecast the number of deaths, the number of recovered cases, and the total confirmed cases for the next 30 days. Method: For prediction, the data are collected from GitHub and India's Ministry of Health and Family Welfare from March 14, 2020, to December 3, 2020. The model has been designed in Python 3.6 in Anaconda to forecast corona trends until September 21, 2020. The proposed methodology is based on the prediction of values using an SVM-based regression model with polynomial, linear, and RBF kernels. The dataset has been divided into train and test datasets with 40% and 60% sizes and verified with real data. The model performance parameters are evaluated as mean square error, mean absolute error, and percentage accuracy. Results and Conclusion: The results show that the polynomial model obtained an accuracy score above 95%, the linear model above 90%, and the RBF model above 85% in predicting cumulative deaths, confirmed cases, and recovered cases.
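As a hedged sketch of the forecasting setup described above, the snippet below fits support vector regression models with linear, polynomial, and RBF kernels to a synthetic cumulative-case curve indexed by day and reports the error metrics named in the abstract; the curve, split, and hyperparameters are illustrative, not the paper's data or tuning.

```python
# Hedged sketch: SVR with three kernels on a synthetic cumulative-case curve.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error

days = np.arange(1, 201).reshape(-1, 1)
cases = 1000.0 / (1 + np.exp(-(days.ravel() - 100) / 15))   # logistic-like curve

X_tr, X_te, y_tr, y_te = train_test_split(days, cases, test_size=0.4,
                                          shuffle=False)
for kernel in ("linear", "poly", "rbf"):
    model = SVR(kernel=kernel, C=100.0, degree=3).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(kernel, "MSE:", mean_squared_error(y_te, pred),
          "MAE:", mean_absolute_error(y_te, pred))
```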

