Gaussian Distribution-Based Machine Learning Scheme for Anomaly Detection in Healthcare Sensor Cloud

Smart information systems are based on sensors that generate huge amounts of data. This data can be stored in the cloud for further processing and efficient utilization. Anomalous data might be present within the sensor data for various reasons (e.g., malicious activities by intruders, low-quality sensors, and node deployment in harsh environments). Anomaly detection is crucial in applications such as healthcare monitoring systems, forest fire information systems, and other Internet of Things (IoT) systems. This paper proposes a Gaussian distribution-based supervised machine learning scheme for anomaly detection (GDA) in a healthcare monitoring sensor cloud, which integrates the body sensors of different patients with the cloud. This work is implemented in Python. Use of a Gaussian statistical model in the proposed scheme improves precision, throughput, and efficiency. GDA provides 98% efficiency, a 3% and 4% improvement over other supervised learning-based anomaly detection schemes (support vector machine [SVM] and self-organizing map [SOM], respectively).
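The abstract does not give the details of GDA, so the following is only a minimal sketch of the general idea behind Gaussian-model anomaly detection: fit a mean and standard deviation to readings assumed to be normal, then flag new readings whose probability density falls below a threshold. The function names, the heart-rate values, and the three-sigma choice of `epsilon` are all illustrative assumptions, not taken from the paper; in a supervised setting the threshold would more plausibly be tuned on labelled validation data.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Estimate the mean and (population) standard deviation of the readings."""
    return mean(values), pstdev(values)


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def detect_anomalies(readings, mu, sigma, epsilon):
    """Return True for each reading whose density is below the threshold epsilon."""
    return [gaussian_pdf(x, mu, sigma) < epsilon for x in readings]


# Hypothetical heart-rate readings (beats per minute) assumed to be normal.
train = [72, 75, 71, 74, 73, 76, 70, 74, 72, 73]
mu, sigma = fit_gaussian(train)

# Assumed threshold: the density three standard deviations away from the mean.
epsilon = gaussian_pdf(mu + 3 * sigma, mu, sigma)

new_readings = [73, 110, 71, 40]
flags = detect_anomalies(new_readings, mu, sigma, epsilon)
print(list(zip(new_readings, flags)))
```

With several sensors per patient, the same idea could be applied per feature or with a multivariate Gaussian; which of these the paper uses is not stated in the abstract.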

Semi-supervised Machine Learning Aided Anomaly Detection Method in Cellular Networks

10.36227/techrxiv.11634720 ◽

2020 ◽

Author(s):

Yutao Lu ◽

Juan Wang ◽

Miao Liu ◽

Kaixuan Zhang ◽

Guan Gui ◽

...

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Cellular Networks ◽

Expert Knowledge ◽

Learning Algorithm ◽

Positive Sample ◽

Supervised Machine Learning ◽

Support Vector ◽

Soft Decision ◽

Decision Methods

The ever-increasing amount of data in cellular networks makes it challenging for network operators to monitor the quality of experience (QoE). Traditional hard decision methods based on key quality indicators (KQIs) struggle with QoE anomaly detection at big-data scale. To solve this problem, we propose a KQI-based QoE anomaly detection framework using a semi-supervised machine learning algorithm, namely the iterative positive sample aided one-class support vector machine (IPS-OCSVM). The proposed method has four steps, the key one being the combination of machine learning with the network operator's expert knowledge using OCSVM. The proposed IPS-OCSVM framework realizes QoE anomaly detection through soft decision and can easily fine-tune the anomaly detection ability on demand. Moreover, we show that fluctuation of the expert-knowledge-based KQI thresholds has a limited impact on the anomaly detection result. Finally, experimental results are given to confirm the proposed IPS-OCSVM framework for QoE anomaly detection in cellular networks.

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

SVM Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Predictive Modelling of Employee Turnover in Indian IT Industry Using Machine Learning Techniques

Vision The Journal of Business Perspective ◽

10.1177/0972262918821221 ◽

2019 ◽

Vol 23 (1) ◽

pp. 12-21 ◽

Cited By ~ 2

Author(s):

Shikha N. Khera ◽

Divya

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Confusion Matrix ◽

Predictive Modelling ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

IT Industry ◽

Knowledge Based ◽

Employee Attrition

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model that predicts employee attrition and gives organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including the employees' employment status (response variable) at the time of collection. The confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will stay.
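To make the last two claims concrete, the sketch below shows how overall accuracy and per-class recall are read off a binary confusion matrix. The labels are invented for illustration (1 = leaves, 0 = stays) and are not the study's data; `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Invented example labels: 1 = employee leaves, 0 = employee stays.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # share of all predictions that are correct
recall_leave = tp / (tp + fn)        # share of actual leavers correctly identified
recall_stay = tn / (tn + fp)         # share of actual stayers correctly identified
print(accuracy, recall_leave, recall_stay)
```

In this toy case the classifier finds every leaver but misclassifies some stayers, which is the kind of asymmetry the abstract describes: a single accuracy figure can hide different performance on the two classes.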

Supervised Machine Learning Methods and Hyperspectral Imaging Techniques Jointly Applied for Brain Cancer Classification

Sensors ◽

10.3390/s21113827 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3827

Author(s):

Gemma Urbanos ◽

Alberto Martín ◽

Guillermo Vázquez ◽

Marta Villanueva ◽

Manuel Villa ◽

...

Keyword(s):

Machine Learning ◽

Blood Vessel ◽

Hyperspectral Imaging ◽

Imaging Techniques ◽

Venous Blood ◽

Healthy Tissue ◽

Supervised Machine Learning ◽

Support Vector ◽

Arterial Blood

Hyperspectral imaging techniques (HSI) do not require contact with patients and are non-ionizing as well as non-invasive. As a consequence, they have been extensively applied in the medical field. HSI is being combined with machine learning (ML) processes to obtain models that assist in diagnosis. In particular, the combination of these techniques has proven to be a reliable aid in differentiating healthy and tumor tissue during brain tumor surgery. ML algorithms such as support vector machine (SVM), random forest (RF) and convolutional neural networks (CNN) are used to make predictions and provide in-vivo visualizations that may help neurosurgeons be more precise, hence reducing damage to healthy tissue. In this work, thirteen in-vivo hyperspectral images from twelve different patients with high-grade gliomas (grade III and IV) have been selected to train SVM, RF and CNN classifiers. Five different classes have been defined during the experiments: healthy tissue, tumor, venous blood vessel, arterial blood vessel and dura mater. Overall accuracy (OACC) results vary from 60% to 95% depending on the training conditions. Finally, as far as the contribution of each band to the OACC is concerned, the results obtained in this work are 3.81 times greater than those reported in the literature.

Financial Context News Sentiment Analysis for the Lithuanian Language

Applied Sciences ◽

10.3390/app11104443 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4443

Author(s):

Rokas Štrimaitis ◽

Pavel Stefanovič ◽

Simona Ramanauskaitė ◽

Asta Slotkienė

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Experimental Investigations ◽

Support Vector ◽

Applied Machine Learning ◽

Bayes Algorithm ◽

Website Content

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the text of news portal articles. Sentiment analysis methods for English texts and for financial texts exist and are accurate; for the more complex Lithuanian language, however, existing work mostly concentrates on sentiment analysis of comment texts and does not provide high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to assign sentiment to financial context news gathered from Lithuanian language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The other algorithms' accuracies were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
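The grid search mentioned in the abstract can be sketched generically: every combination of candidate hyperparameter values is scored, and the best-scoring combination is kept. The parameter names (`alpha`, `min_count`) and the scoring function below are stand-ins chosen for illustration; in practice the score would be a cross-validated accuracy of the classifier being tuned, and the paper's actual grids are not given in the abstract.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Exhaustively score every parameter combination and keep the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(alpha, min_count):
    """Stand-in for a validation accuracy; peaks at alpha=0.5, min_count=2."""
    return 1.0 - (alpha - 0.5) ** 2 - 0.01 * (min_count - 2) ** 2


# Hypothetical grid: a smoothing strength and a minimum token count.
grid = {"alpha": [0.1, 0.5, 1.0], "min_count": [1, 2, 5]}
best_params, best_score = grid_search(grid, toy_validation_score)
print(best_params, best_score)
```

The cost grows with the product of the grid sizes (here 3 × 3 = 9 evaluations), which is why grids are usually kept coarse.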

Preserving Privacy in Multimedia Social Networks Using Machine Learning Anomaly Detection

Security and Communication Networks ◽

10.1155/2020/5874935 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Randa Aljably ◽

Yuan Tian ◽

Mznah Al-Rodhaan

Keyword(s):

Machine Learning ◽

Social Networks ◽

Access Control ◽

Anomaly Detection ◽

Privacy Preservation ◽

Support Vector ◽

Detection Techniques ◽

Access Control Models ◽

Control Models ◽

Multimedia Social Networks

Nowadays, users’ privacy is a critical matter in multimedia social networks. However, traditional machine learning anomaly detection techniques that rely on users’ log files and behavioral patterns are not sufficient to preserve it. Hence, social network security should combine multiple measures that take additional information into account to protect users’ data. More precisely, access control models could complement machine learning algorithms in the process of privacy preservation. The models could use further information derived from users’ profiles to detect anomalous users. In this paper, we implement a privacy preservation algorithm that incorporates supervised and unsupervised machine learning anomaly detection techniques with access control models. Due to the rich and fine-grained policies, our control model continuously updates the list of attributes used to classify users. It has been successfully tested on real datasets, with over 95% accuracy using a Bayesian classifier, and 95.53% on the receiver operating characteristic curve using deep neural networks and long short-term memory recurrent neural network classifiers. Experimental results show that this approach outperforms other detection techniques such as support vector machine, isolation forest, principal component analysis, and Kolmogorov–Smirnov test.

Optimizing machine learning models for granular NdFeB magnets by very fast simulated annealing

Scientific Reports ◽

10.1038/s41598-021-83315-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Hyeon-Kyu Park ◽

Jae-Hyeok Lee ◽

Jehyun Lee ◽

Sang-Koog Kim

Keyword(s):

Machine Learning ◽

Simulated Annealing ◽

Permanent Magnets ◽

Supervised Machine Learning ◽

Support Vector ◽

Micromagnetic Simulations ◽

NdFeB Magnets ◽

Average Grain Size ◽

Macroscopic Properties ◽

Very Fast Simulated Annealing

The macroscopic properties of permanent magnets and the resultant performance required for real implementations are determined by the magnets’ microscopic features. However, earlier micromagnetic simulations and experimental studies required a relatively large amount of work to gain a complete and comprehensive understanding of the relationships between magnets’ macroscopic properties and their microstructures. Here, by means of supervised learning, we predict reliable values of coercivity (μ0Hc) and maximum magnetic energy product (BHmax) of granular NdFeB magnets according to their microstructural attributes (e.g. inter-grain decoupling, average grain size, and misalignment of easy axes) based on numerical datasets obtained from micromagnetic simulations. We conducted several tests of a variety of supervised machine learning (ML) models including kernel ridge regression (KRR), support vector regression (SVR), and artificial neural network (ANN) regression. The hyper-parameters of these models were optimized by a very fast simulated annealing (VFSA) algorithm with an adaptive cooling schedule. In our datasets of 1,000 randomly generated polycrystalline NdFeB cuboids with different microstructural attributes, all of the models yielded similar results in predicting both μ0Hc and BHmax. Furthermore, some outliers, which deteriorated the normality of residuals in the prediction of BHmax, were detected and further analyzed. Based on all of our results, we can conclude that our ML approach combined with micromagnetic simulations provides a robust framework for optimal design of microstructures for high-performance NdFeB magnets.
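As a rough illustration of hyper-parameter optimization by very fast simulated annealing, the sketch below follows the commonly cited VFSA form: a temperature that decays as T_k = T_0 exp(-c k^(1/D)) for D parameters, and a heavy-tailed step distribution that narrows as the temperature drops. It is not the paper's implementation: the cooling schedule here is fixed rather than adaptive, one temperature is shared by step generation and acceptance, and the objective is a made-up surface over log10 of two hypothetical hyper-parameters standing in for a validation error.

```python
import math
import random


def vfsa_minimize(objective, bounds, n_iter=2000, t0=1.0, c=1.0, seed=0):
    """Minimize objective over box bounds with a simplified VFSA scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = objective(x)
    best_x, best_f = list(x), fx
    for k in range(1, n_iter + 1):
        # Fixed (non-adaptive) schedule; floored to avoid division by zero.
        temp = max(t0 * math.exp(-c * k ** (1.0 / dim)), 1e-12)
        cand = []
        for xi, (lo, hi) in zip(x, bounds):
            u = rng.random()
            sign = 1.0 if u >= 0.5 else -1.0
            # Temperature-dependent step in [-1, 1], scaled to the parameter range.
            y = sign * temp * ((1.0 + 1.0 / temp) ** abs(2.0 * u - 1.0) - 1.0)
            cand.append(min(hi, max(lo, xi + y * (hi - lo))))
        fc = objective(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = list(x), fx
    return best_x, best_f


def toy_validation_error(p):
    """Made-up error surface with its minimum at log10 values (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


# Hypothetical search ranges, in log10, for two hyper-parameters.
bounds = [(-3.0, 3.0), (-5.0, 1.0)]
best_x, best_f = vfsa_minimize(toy_validation_error, bounds)
print(best_x, best_f)
```

In a real setting, `toy_validation_error` would be replaced by training a KRR, SVR, or ANN model with the candidate hyper-parameters and returning its validation error.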

Prediction of CO2 Minimum Miscibility Pressure Using an Augmented Machine-Learning-Based Model

SPE Journal ◽

10.2118/200326-pa ◽

2021 ◽

pp. 1-13

Author(s):

Utkarsh Sinha ◽

Birol Dindoruk ◽

Mohamed Soliman

Keyword(s):

Machine Learning ◽

Phase Behavior ◽

Hybrid Method ◽

Gas Injection ◽

Supervised Machine Learning ◽

Design Parameters ◽

Support Vector ◽

Hydrocarbon Gases ◽

Minimum Miscibility Pressure ◽

Analytical Correlation

Minimum miscibility pressure (MMP) is one of the key design parameters for gas injection projects. It is a physical parameter that is a measure of local displacement efficiency while subject to some constraints due to its definition. Also, the MMP value is used to tune compositional models along with proper fluid description constrained with other available basic phase behavior data, such as bubble point pressure and volumetric properties. In general, carbon dioxide (CO2) and hydrocarbon gases are the most common gases used for (or screened for) gas injection processes, and because of recent focus, they are used to screen for the coupling of CO2-sequestration and CO2-enhanced oil recovery (EOR) projects. Because the CO2/oil phase behavior is quite different than the hydrocarbon gas/oil phase behavior, researchers developed specialized correlations for CO2 or CO2-rich streams. Therefore, there is a need for a tool with expanded range capabilities for the estimation of MMP for CO2 gas streams. The only known and widely accepted measurement technique for MMP that is coherent with its formal definition is the use of a slimtube apparatus. However, the use of slimtube restricts the amount of data available, even though there are other alternative techniques presented over the last three decades, which all have various limitations (Dindoruk et al. 2021). Due to some of the complexities highlighted in Dindoruk et al. (2021) and time and resource requirements, there have been a number of correlations developed in the literature using mostly classical regression techniques with relatively sparse data using various combinations of limited input data (Cronquist 1978; Lee 1979; Yellig and Metcalfe 1980; Alston et al. 1985; Glaso 1985; Jaubert et al. 1998; Emera and Sarma 2005; Yuan et al. 2005; Ahmadi et al. 2010; Ahmadi and Johns 2011).
In this paper, we present two separate approaches for the calculation of the MMP of an oil for CO2 injection: analytical correlation in which the correlation coefficients were tuned using linear support vector machines (SVMs) (Press et al. 2007; MathWorks 2020; RDocumentation 2020b; Cortes and Vapnik 1995) and using a hybrid method (i.e., superlearner model), which consists of the combination of random forest (RF) regression (Breiman 2001) and the proposed analytical correlation. Both models take the compositional analysis of oils up to heptane plus fraction, molecular weight of oil, and the reservoir temperature as input parameters. Based on statistical and data analysis techniques in combination with the help of corresponding crossplots, we showed that the performance of the final proposed method (hybrid method) is superior to all the leading correlations (Cronquist 1978; Lee 1979; Yellig and Metcalfe 1980; Alston et al. 1985; Glaso 1985; Emera and Sarma 2005; Yuan et al. 2005) and supervised machine-learning (Metcalfe 1982) methods considered in the literature (Altman 1992; Chambers and Hastie 1992; Chapelle and Vapnik 2000; Breiman 2001; Press et al. 2007; MathWorks 2020). The proposed model works for the widest spectrum of MMPs from 1,000 to 4,900 psia, which covers the entire range of oils within the scope of CO2 EOR based on the widely used screening criteria (Taber et al. 1997a, 1997b).

Predictive Analytics of Sensor Data Based on Supervised Machine Learning Algorithms

2017 International Conference on Next Generation Computing and Information Systems (ICNGCIS) ◽

10.1109/icngcis.2017.12 ◽

2017 ◽

Cited By ~ 4

Author(s):

Shreya Gupta ◽

Mohit Mittal ◽

Anupama Padha

Keyword(s):

Machine Learning ◽

Predictive Analytics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Sensor Data
