Gaussian Distribution-Based Machine Learning Scheme for Anomaly Detection in Healthcare Sensor Cloud

2021 ◽  
Vol 11 (1) ◽  
pp. 52-72
Author(s):  
Rajendra Kumar Dwivedi ◽  
Rakesh Kumar ◽  
Rajkumar Buyya

Smart information systems are based on sensors that generate a huge amount of data. This data can be stored in the cloud for further processing and efficient utilization. Anomalous data might be present within the sensor data for various reasons (e.g., malicious activities by intruders, low-quality sensors, and node deployment in harsh environments). Anomaly detection is crucial in applications such as healthcare monitoring systems, forest fire information systems, and other internet of things (IoT) systems. This paper proposes a Gaussian distribution-based supervised machine learning scheme of anomaly detection (GDA) for the healthcare monitoring sensor cloud, which integrates the body sensors of different patients with the cloud. This work is implemented in Python. The use of a Gaussian statistical model in the proposed scheme improves precision, throughput, and efficiency. GDA provides 98% efficiency, an improvement of 3% and 4% over other supervised learning-based anomaly detection schemes, namely support vector machine (SVM) and self-organizing map (SOM), respectively.
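
The abstract does not spell out the internals of GDA, so the following is only a minimal sketch of the general idea behind Gaussian-based anomaly detection, under the common textbook assumption of independent per-feature Gaussians. The function names, the sample vital-sign readings, and the threshold `EPSILON` are all hypothetical; in a supervised setting the threshold would normally be tuned on labeled validation data rather than fixed by hand.

```python
import math

def fit_gaussian(rows):
    """Estimate per-feature mean and variance from readings assumed to be normal."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(dims)]
    return means, variances

def density(x, means, variances):
    """Product of univariate Gaussian densities (features treated as independent)."""
    p = 1.0
    for xj, mu, var in zip(x, means, variances):
        p *= math.exp(-((xj - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p

def is_anomaly(x, means, variances, epsilon):
    """Flag a reading whose estimated density falls below the threshold."""
    return density(x, means, variances) < epsilon

# Hypothetical (heart rate in bpm, body temperature in deg C) readings.
normal_readings = [(72, 36.6), (75, 36.8), (70, 36.5),
                   (78, 37.0), (74, 36.7), (71, 36.6)]
means, variances = fit_gaussian(normal_readings)
EPSILON = 1e-4  # illustrative threshold, not taken from the paper

typical_flag = is_anomaly((73, 36.7), means, variances, EPSILON)
extreme_flag = is_anomaly((140, 39.5), means, variances, EPSILON)
```

A reading close to the fitted means receives a high density and passes, while a reading many standard deviations away receives a near-zero density and is flagged.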

2020 ◽  
Author(s):  
Yutao Lu ◽  
Juan Wang ◽  
Miao Liu ◽  
Kaixuan Zhang ◽  
Guan Gui ◽  
...  

The ever-increasing amount of data in cellular networks makes it challenging for network operators to monitor the quality of experience (QoE). Traditional hard-decision methods based on key quality indicators (KQIs) struggle with QoE anomaly detection on big data. To solve this problem, we propose a KQI-based QoE anomaly detection framework using a semi-supervised machine learning algorithm, namely the iterative positive sample aided one-class support vector machine (IPS-OCSVM). The proposed method consists of four steps, the key one being the combination of machine learning with the network operator's expert knowledge using OCSVM. The IPS-OCSVM framework performs QoE anomaly detection through soft decisions, and its detection ability can easily be fine-tuned on demand. Moreover, we prove that fluctuations of the expert-knowledge-based KQI thresholds have a limited impact on the anomaly detection result. Finally, experimental results are given to validate the proposed IPS-OCSVM framework for QoE anomaly detection in cellular networks.


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to the companies. The aim of this research is to develop a model that predicts employee attrition and gives organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including each employee's employment status (response variable) at the time of collection. The confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will stay.
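
The evaluation described above, an overall accuracy plus a comparison of how well each class is predicted, can be read directly off a 2x2 confusion matrix. The sketch below shows the arithmetic only; the counts are invented for illustration and are not the study's data, and the helper name `summarize` is hypothetical.

```python
def summarize(tp, fn, fp, tn):
    """Return accuracy and per-class recall from confusion-matrix counts.

    tp: leavers predicted as leavers    fn: leavers predicted as stayers
    fp: stayers predicted as leavers    tn: stayers predicted as stayers
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall_leave = tp / (tp + fn)  # share of actual leavers that were caught
    recall_stay = tn / (tn + fp)   # share of actual stayers that were caught
    return accuracy, recall_leave, recall_stay

# Hypothetical counts chosen purely for illustration.
accuracy, recall_leave, recall_stay = summarize(tp=40, fn=10, fp=15, tn=35)
```

When `recall_leave` exceeds `recall_stay`, as with these made-up counts, the classifier is better at identifying employees who leave than those who stay, which is the kind of asymmetry the abstract reports.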


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3827
Author(s):  
Gemma Urbanos ◽  
Alberto Martín ◽  
Guillermo Vázquez ◽  
Marta Villanueva ◽  
Manuel Villa ◽  
...  

Hyperspectral imaging (HSI) techniques do not require contact with patients and are non-ionizing as well as non-invasive. As a consequence, they have been extensively applied in the medical field. HSI is being combined with machine learning (ML) processes to obtain models that assist in diagnosis. In particular, the combination of these techniques has proven to be a reliable aid in differentiating healthy and tumor tissue during brain tumor surgery. ML algorithms such as support vector machine (SVM), random forest (RF) and convolutional neural networks (CNN) are used to make predictions and provide in-vivo visualizations that may help neurosurgeons be more precise, hence reducing damage to healthy tissue. In this work, thirteen in-vivo hyperspectral images from twelve different patients with high-grade gliomas (grade III and IV) have been selected to train SVM, RF and CNN classifiers. Five different classes have been defined during the experiments: healthy tissue, tumor, venous blood vessel, arterial blood vessel and dura mater. Overall accuracy (OACC) results vary from 60% to 95% depending on the training conditions. Finally, as far as the contribution of each band to the OACC is concerned, the results obtained in this work are 3.81 times greater than those reported in the literature.


2021 ◽  
Vol 11 (10) ◽  
pp. 4443
Author(s):  
Rokas Štrimaitis ◽  
Pavel Stefanovič ◽  
Simona Ramanauskaitė ◽  
Asta Slotkienė

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the text of news portal articles. Sentiment analysis solutions for English texts and for financial texts exist and are accurate, whereas work on the more complex Lithuanian language has mostly concentrated on sentiment analysis of comment texts and does not provide high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to assign sentiment to financial news gathered from Lithuanian-language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The accuracies of the other algorithms were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
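
To illustrate what a grid search over a classifier's hyperparameters involves, the sketch below implements a small multinomial Naive Bayes classifier and tries several values of its smoothing parameter on a held-out validation set. This is an assumption-laden toy, not the authors' pipeline: the tokens, labels, grid values, and function names are all hypothetical, and a single validation split stands in for whatever evaluation protocol the paper actually used.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha):
    """Collect class and word counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha

def predict_nb(model, tokens):
    """Return the class with the highest smoothed log-posterior."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok in vocab:  # unseen words are ignored
                count = word_counts[label][tok]
                score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, already-tokenized toy data (English stand-ins for news text).
train_docs = [["profit", "growth", "record"], ["growth", "strong", "profit"],
              ["loss", "decline", "debt"], ["decline", "weak", "loss"]]
train_labels = ["positive", "positive", "negative", "negative"]
val_docs = [["record", "profit"], ["debt", "loss"]]
val_labels = ["positive", "negative"]

# Grid search: keep the smoothing value with the best validation accuracy.
grid = [0.1, 0.5, 1.0]
best_alpha, best_accuracy = None, -1.0
for alpha in grid:
    model = train_nb(train_docs, train_labels, alpha)
    hits = sum(predict_nb(model, d) == y for d, y in zip(val_docs, val_labels))
    acc = hits / len(val_docs)
    if acc > best_accuracy:
        best_alpha, best_accuracy = alpha, acc
```

On real data the grid would typically cover more hyperparameters and be scored with cross-validation, so that the selected setting does not depend on one particular split.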


2020 ◽  
Vol 2020 ◽  
pp. 1-14 ◽  
Author(s):  
Randa Aljably ◽  
Yuan Tian ◽  
Mznah Al-Rodhaan

Nowadays, user privacy is a critical matter in multimedia social networks. However, traditional machine learning anomaly detection techniques that rely on users’ log files and behavioral patterns are not sufficient to preserve it. Hence, social network security should combine multiple security measures that take additional information into account to protect users’ data. More precisely, access control models could complement machine learning algorithms in the process of privacy preservation. The models could use further information derived from users’ profiles to detect anomalous users. In this paper, we implement a privacy preservation algorithm that incorporates supervised and unsupervised machine learning anomaly detection techniques with access control models. Due to the rich and fine-grained policies, our control model continuously updates the list of attributes used to classify users. It has been successfully tested on real datasets, with over 95% accuracy using a Bayesian classifier, and 95.53% on the receiver operating characteristic curve using deep neural networks and long short-term memory recurrent neural network classifiers. Experimental results show that this approach outperforms other detection techniques such as support vector machine, isolation forest, principal component analysis, and Kolmogorov–Smirnov test.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hyeon-Kyu Park ◽  
Jae-Hyeok Lee ◽  
Jehyun Lee ◽  
Sang-Koog Kim

The macroscopic properties of permanent magnets and the resultant performance required for real implementations are determined by the magnets’ microscopic features. However, earlier micromagnetic simulations and experimental studies required a relatively large amount of work to gain any complete and comprehensive understanding of the relationships between magnets’ macroscopic properties and their microstructures. Here, by means of supervised learning, we predict reliable values of coercivity (μ0Hc) and maximum magnetic energy product (BHmax) of granular NdFeB magnets according to their microstructural attributes (e.g. inter-grain decoupling, average grain size, and misalignment of easy axes) based on numerical datasets obtained from micromagnetic simulations. We conducted several tests of a variety of supervised machine learning (ML) models including kernel ridge regression (KRR), support vector regression (SVR), and artificial neural network (ANN) regression. The hyper-parameters of these models were optimized by a very fast simulated annealing (VFSA) algorithm with an adaptive cooling schedule. In our datasets of 1,000 randomly generated polycrystalline NdFeB cuboids with different microstructural attributes, all of the models yielded similar results in predicting both μ0Hc and BHmax. Furthermore, some outliers, which deteriorated the normality of residuals in the prediction of BHmax, were detected and further analyzed. Based on all of our results, we can conclude that our ML approach combined with micromagnetic simulations provides a robust framework for optimal design of microstructures for high-performance NdFeB magnets.


SPE Journal ◽  
2021 ◽  
pp. 1-13
Author(s):  
Utkarsh Sinha ◽  
Birol Dindoruk ◽  
Mohamed Soliman

Minimum miscibility pressure (MMP) is one of the key design parameters for gas injection projects. It is a physical parameter that is a measure of local displacement efficiency while subject to some constraints due to its definition. Also, the MMP value is used to tune compositional models along with proper fluid description constrained with other available basic phase behavior data, such as bubble point pressure and volumetric properties. In general, carbon dioxide (CO2) and hydrocarbon gases are the most common gases used for (or screened for) gas injection processes, and because of recent focus, they are used to screen for the coupling of CO2-sequestration and CO2-enhanced oil recovery (EOR) projects. Because the CO2/oil phase behavior is quite different from the hydrocarbon gas/oil phase behavior, researchers developed specialized correlations for CO2 or CO2-rich streams. Therefore, there is a need for a tool with expanded range capabilities for the estimation of MMP for CO2 gas streams. The only known and widely accepted measurement technique for MMP that is coherent with its formal definition is the use of a slimtube apparatus. However, the use of slimtube restricts the amount of data available, even though there are other alternative techniques presented over the last three decades, which all have various limitations (Dindoruk et al. 2021). Due to some of the complexities highlighted in Dindoruk et al. (2021) and time and resource requirements, there have been a number of correlations developed in the literature using mostly classical regression techniques with relatively sparse data using various combinations of limited input data (Cronquist 1978; Lee 1979; Yellig and Metcalfe 1980; Alston et al. 1985; Glaso 1985; Jaubert et al. 1998; Emera and Sarma 2005; Yuan et al. 2005; Ahmadi et al. 2010; Ahmadi and Johns 2011).
In this paper, we present two separate approaches for the calculation of the MMP of an oil for CO2 injection: analytical correlation in which the correlation coefficients were tuned using linear support vector machines (SVMs) (Press et al. 2007; MathWorks 2020; RDocumentation 2020b; Cortes and Vapnik 1995) and using a hybrid method (i.e., superlearner model), which consists of the combination of random forest (RF) regression (Breiman 2001) and the proposed analytical correlation. Both models take the compositional analysis of oils up to heptane plus fraction, molecular weight of oil, and the reservoir temperature as input parameters. Based on statistical and data analysis techniques in combination with the help of corresponding crossplots, we showed that the performance of the final proposed method (hybrid method) is superior to all the leading correlations (Cronquist 1978; Lee 1979; Yellig and Metcalfe 1980; Alston et al. 1985; Glaso 1985; Emera and Sarma 2005; Yuan et al. 2005) and supervised machine-learning (Metcalfe 1982) methods considered in the literature (Altman 1992; Chambers and Hastie 1992; Chapelle and Vapnik 2000; Breiman 2001; Press et al. 2007; MathWorks 2020). The proposed model works for the widest spectrum of MMPs from 1,000 to 4,900 psia, which covers the entire range of oils within the scope of CO2 EOR based on the widely used screening criteria (Taber et al. 1997a, 1997b).

